Slashdot Mirror


The Mac, Metadata, and the World

Rick Zeman writes: "ArsTechnica has posted yet another compelling article, this time on metadata, its history and the future of metadata storage as seemingly indicated by Apple in OS X. Extensions==Bad!"

307 comments

  1. Metadata by Tyler Eaves · · Score: 1

    Metadata: A PITA for all real users!

    --
    TODO: Something witty here...
  2. File Type = Immutable ? by ender_ · · Score: 1

    The author correlates changing the extension of a filename with changing the file size. I often run across files on some Macintosh machines that I maintain. If I didn't have the ability to change the file type, I wouldn't be able to view or edit the data. Being able to readily change the viewer/filetype lets people read things the computer doesn't know how to handle.

    --
    Bzzt Whir Click
    1. Re:File Type = Immutable ? by Visigothe · · Score: 1

      No this isn't what the article is stating... The article said that "File Type" can be changed to reflect *accuracy* ie, if your file *is* a JPEG, but for some reason, the meta-data associated with the file tags it as a TIFF, you can change the meta-data to the more accurate JPEG, without affecting the actual encoding of the file itself.

    2. Re:File Type = Immutable ? by Anonymous Coward · · Score: 0

      That's part of the point ... the reason you sometimes have to "re-brand" a file with its proper file type is because it's so easy for a file to lose that information as it travels from system to system. The author is suggesting that other systems should start tracking that important bit of metadata, and that Mac OS shouldn't stop tracking it.

      Of course you will always be able to fix one that's wrong, just as you can have an incorrect system clock and make a file with a date of 1904, and then change the creation date later in order to correct it.

    3. Re:File Type = Immutable ? by rabidcow · · Score: 1

      There is still a problem with considering the file type as immutable. Some data can be interpreted as a text file, a C source file, a perl script, etc. There are times when the VERY SAME DATA can be interpreted in different ways, and it's not just a matter of accuracy.

      Take a comma delimited list of numbers for example. It could be interpreted as spreadsheet data. It could be interpreted as a raw list of pixel shades or elevations for a terrain. It might actually be a strangely encoded file of some other type, but all of these interpretations of the data can be useful and to say that this is just a matter of accuracy is misleading at best.

      This may only happen in very strange cases, but it's not strictly immutable.

    4. Re:File Type = Immutable ? by Anonymous Coward · · Score: 0

      I have found this to be a very common problem when I worked with mac. It's very often, that files get orphaned and the file type is lost, and it is impossible to get the programs to open the file, because they believe that it's another program's file. The only solution would be (in the past that is to say) to go into ResEdit and manually change the file type, which is a bit of a hassle. I like it better with extensions, although there's too many program suites that greedily suck up a number of filetypes as their own, especially MusicMatch or RealAudio in Windows.

    5. Re:File Type = Immutable ? by Anonymous Coward · · Score: 0

      > It's very often, that files get orphaned and
      >the file type is lost, and it is impossible to
      >get the programs to open the file, because
      >they believe that it's another program's file

      unlikely. you're mixing something up here. a mac file has a type and a creator code. if a file gets orphaned, the creator can't be found. the type is still valid, and if you drag the file on a program that can handle the type of that file, there's no problem about the creator code being different.

    6. Re:File Type = Immutable ? by smack.addict · · Score: 2
      A text file is an unstructured document with character data. A C source file is a structured document containing C instructions. A Perl document ...


      A comma delimited list of numbers is a comma delimited list of numbers in accordance with some pre-defined structuring rules.


      The file type is immutable. How we use the files may differ on context.

    7. Re:File Type = Immutable ? by John Siracusa · · Score: 1
      There is still a problem with considering the file type as immutable. Some data can be interpreted as a text file, a C source file, a perl script, etc. There are times when the VERY SAME DATA can be interpreted in different ways, and it's not just a matter of accuracy.

      As I posted earlier, I think it is. What you really want to say is that a file is simultaneously several types. This is simply a matter of accuracy, just as indicating that a file is both "text" and "HTML" is an increase in accuracy over just calling it "text." The fact that "text" and "HTML" have a parent-->child relationship instead of a sibling relationship (as may be the case in some file that is both valid "Spreadsheet data" and "Address book data", for example) does not change the fact that it is merely an increase in the accuracy/resolution of the file type data.

  3. Linux? by interiot · · Score: 4, Interesting
    The glaring message I got from this was: Windows implements file type metadata quite badly.

    And the glaring question was: why is Linux blindly following Windows? Linux's file type handling is still in a somewhat early stage, it wouldn't be inconceivable for the paradigm to change.

    1. Re:Linux? by David Roundy · · Score: 2, Interesting

      Because linux is based on unix, and has blindly followed unix, not windows. Also, linux supports a vast number of file systems, which means that the metadata would have to either be stored on all those file systems, or the OS would have to be able to live without it, which would probably lead to it being ignored by most of the software.

      Unless metadata is implemented consistently, its use can do more harm than good. ("I copied this jpeg (named 'picture') to my windows partition and back again, and now I can't view it!")

    2. Re:Linux? by Anonymous Coward · · Score: 0

      Linux (and other Unix-ish operating systems) follow Unix' file type metadata standards. That is to say, there are none.

      I think metadata would be a great thing, especially if done with something like MIME types where there's an open standard and updates are easy to obtain. Then, the file type really COULD be stored as a 32bit (or 64bit) integer in filesystem metadata, with lookups and mapping happening only as needed.
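
A minimal sketch of the "type as an integer" idea from the paragraph above, assuming a hypothetical in-memory registry. `MimeRegistry` and `pack_type_id` are illustrative names, not part of any real filesystem or MIME library, and the ID assignment scheme is an assumption rather than an existing standard.

```python
import struct


class MimeRegistry:
    """Maps MIME type strings to small integers and back (hypothetical)."""

    def __init__(self):
        self._id_by_name = {}
        self._name_by_id = []

    def intern(self, mime_type: str) -> int:
        """Return the integer ID for a MIME type, assigning one if needed."""
        if mime_type not in self._id_by_name:
            self._id_by_name[mime_type] = len(self._name_by_id)
            self._name_by_id.append(mime_type)
        return self._id_by_name[mime_type]

    def lookup(self, type_id: int) -> str:
        """Map a stored ID back to its MIME string, only when needed."""
        return self._name_by_id[type_id]


def pack_type_id(type_id: int) -> bytes:
    """Encode the ID as the 4 bytes a filesystem might store per file."""
    return struct.pack("<I", type_id)
```

The string lookup happens only when something needs the human-readable name; the per-file cost is the four packed bytes.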

      Even better would be to conglomerate all these ideas and run a filesystem "view" from a relational database. Exporting a UFS filesystem "view" via NFS, while remapping filenames on the fly (cache for better performance) would be great, and would allow for the best of all worlds, plus easy manipulation and adaptation of existing file data to suit new purposes.

    3. Re:Linux? by Remote · · Score: 2, Insightful

      I don't think Windows does a bad job at storing data type information. It just doesn't try to. What Windows stores in the filename is file format information. A song tablature, ASCII art and C++ source code are very different things, but you can call them all TXT's and operate on them with no problem at all. The author really messes up things a bit in this matter. You can have, say, two LZW-compressed paletted images. One as a GIF and the other as a TIFF. Pretty much the same data type, but with different headers/tags, different LZW max. prefix length, maybe different byte-order. Same for a JPEG TIFF and a JFIF. Actually, what is the point in saying Image/gif when you can't have Sound/gif or Text/gif?

      I really don't think Apple came up with an extensionless filename scheme purely out of conceptual considerations. Anyone who has ever tried to educate someone on how to use a computer for the first time knows that file extensions can be confusing! The Mac was built to be easy. I would go as far as to say it was built to reach people who were afraid of computers. The fact is that some other people do need a command prompt, and that interface does benefit from file extensions.

      Now, Linux is not following Windows at all on this. Fire up Konqueror and see how it identifies most files, extension notwithstanding. Or try #man file.

      But, what do I know?

    4. Re:Linux? by Jeffrey Baker · · Score: 2

      Linux programs discover the type of the file by looking at the file's contents (ref: file(1)). I think this is an obvious and straightforward way to determine file type, and is therefore not prone to implementation bugs. The mapping of file type to program is handled by application environments like GNOME and KDE. Nautilus determines a file's type and offers a number of programs useful for manipulating the file. I think it works great.
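
A rough sketch of the content-sniffing approach described above, assuming a tiny hand-written signature table. This is not how file(1) is implemented; `MAGIC_TABLE`, `sniff_type` and `sniff_file` are hypothetical names, and the real magic database is far larger and more nuanced.

```python
# Hypothetical, heavily abbreviated magic table: (offset, signature, type name).
MAGIC_TABLE = [
    (0, b"\xff\xd8\xff", "image/jpeg"),
    (0, b"\x89PNG\r\n\x1a\n", "image/png"),
    (0, b"GIF87a", "image/gif"),
    (0, b"GIF89a", "image/gif"),
    (0, b"%PDF-", "application/pdf"),
]


def sniff_type(header: bytes) -> str:
    """Guess a type from the leading bytes of a file."""
    for offset, signature, type_name in MAGIC_TABLE:
        if header[offset:offset + len(signature)] == signature:
            return type_name
    # No signature matched: fall back to a generic binary type.
    return "application/octet-stream"


def sniff_file(path: str, probe_size: int = 64) -> str:
    """Open the file and sniff only its first probe_size bytes."""
    with open(path, "rb") as handle:
        return sniff_type(handle.read(probe_size))
```

Formats without a distinctive leading signature fall through to the generic type, which is where guessing by content starts to get unreliable.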

    5. Re:Linux? by AndrewHowe · · Score: 2

      Well you say it works for you, so I guess I shouldn't disagree with you. But IMHO it's not a good way to do things. A given program should be able to determine if a given file is of its own type. However, when you're given a file and you have to determine which of a thousand applications it belongs to, that's a whole world of pain. If everyone agreed to put something in the same place in every file, say at the beginning, then it could work. But, it would be ugly, and... Hey, if it's in the same place for every file, why not just take it out of the file and associate it instead? Nautilus may well be able to determine file types, but it's not going to be efficient at it. For example if you throw it a big directory full of stuff it's going to have to scan arbitrary amounts of those files to work out what type they are, and I bet it's not 100% accurate either.

    6. Re:Linux? by Anonymous Coward · · Score: 0
      Linux's file type handling is still in a somewhat early stage, it wouldn't be inconceivable for the paradigm to change.

      I wasn't aware that filetype recognition was done at the kernel level. Thanks for the heads-up.

      that was sarcasm, folks

    7. Re:Linux? by izzertaq · · Score: 1

      file(1) looks at the contents of the file to determine what type of file it is.

    8. Re:Linux? by AndrewHowe · · Score: 2

      Yes, I know. I'm just saying it's not a good general solution to the problem (although it's not a bad idea at all as a last resort, instead of just giving up).

    9. Re:Linux? by Anonymous Coward · · Score: 0

      Mac OS X also looks inside files to try and discover what file type they are, but I think it is the third or fourth thing it does after the HFS file type and the filename extension. Not all files have this information in the header, though. Internet files usually do (PDF, JPEG, etc), but I don't know if a Microsoft Word document does.

      Still, I have gigabytes and gigabytes of files on hard disks, backup tapes, data DVD's and CD's and such that don't have filename extensions. Am I supposed to go through all of those disks and append ".text" and ".mpeg" to things? By the time I am done everyone else might have gotten their shit together and we'd be back to no filename extensions. Plus, most of those files are media files (AIFF audio, say) that get called from a main document, and you can't necessarily rename them all that easily without opening the master documents.

      Anyway, Apple doesn't have total control over this. Mac apps are showing up on Mac OS X and they're making filename extensions optional (with a sticky checkbox for "add filename extension" in the Save panel, for example). The OS has to deal with files that don't have extensions right now and it will have to keep dealing with them. You can't really go back because so many files are already made without extensions. All those hit records made with Pro Tools and the "master tapes" these days are files on an HFS+ volume ... the only filename extensions Pro Tools uses are ".L" and ".R" for each mono file in a split stereo pair.

    10. Re:Linux? by Anonymous Coward · · Score: 0

      You take a pretty big performance hit if you open a folder of 1000 files and the OS has to look into each one before it can display an icon. Much better just to have this important data stamped onto the file.

    11. Re:Linux? by cloudmaster · · Score: 2

      For example if you throw it a big directory full of stuff it's going to have to scan arbitrary amounts of those files to work out what type they are, and I bet it's not 100% accurate either.


      So, the metadata implementation suggested somehow frees the interface from scanning a large list of files? It's still gotta build a list of files, and it's still gotta look up metadata for each one. You'd have to have a darned big directory to have any significant difference there. Macs make up part of the systems that I admin, and they're certainly no joy to view directories with lots of files within - mostly because of the metadata (the B-Tree thing is pretty cool, though).

      As far as accuracy goes, using magic is pretty accurate, but you're right - it's not 100%. It allows a nice migration from older filesystems without having to touch the filesystem itself, though, and is fairly easy for graphical frontends to implement. Personally, I think the tradeoff in accuracy is a pretty good one, 'cause it allows the CLI to continue functioning cleanly.

    12. Re:Linux? by swb · · Score: 2

      Anyone who has ever tried to educate someone on how to use a computer for the first time knows that file extensions can be confusing!

      Anyone who has ever had to work with a Mac user knows how confused they get by files with an extension (eg, .DOC) but with no creator and filetype information. They flail on the mouse button and then tell you that your document is damaged or that they can't open your TeachText document and oh by the way why did you do that spreadsheet in TeachText anyway? They simply cannot understand the idea of opening the application and then opening the document.

      Now I know that this can be interpreted as a virtue of the Mac OS because it's allowing you to focus on your "job" and not on your "computer", and maybe it is. But it also strikes me as a little self-defeating because the user doesn't ever get beyond the flail-on-the-mouse stage, which sounds to me like they're not getting much out of their computer.

    13. Re:Linux? by wik · · Score: 1

      Cygwin used to (and may still) do this on win32 filesystems to figure out Unix-esque permissions. It can take forever, even on small directories (I have noticed significant delays on directories with a bunch of multi-gigabyte files). All in all, it's an easy hack, but it hurts.

      --
      / \
      \ / ASCII ribbon campaign for peace
      x
      / \
    14. Re:Linux? by Raven667 · · Score: 2
      So, the metadata implementation suggested somehow frees the interface from scanning a large list of files? It's still gotta build a list of files, and it's still gotta look up metadata for each one. You'd have to have a darned big directory to have any significant difference there. Macs make up part of the systems that I admin, and they're certainly no joy to view directories with lots of files within - mostly because of the metadata (the B-Tree thing is pretty cool, though).

      I disagree with this point. In a filesystem that stored the file type (I prefer MIME) as part of the filesystem data, the additional time required to read in the filetype would be imperceptible to the user. The file type would be read in at the same time as the mtime, permissions, etc. To run file(1) on each file would require open()ing each one, reading in the first hundred bytes or so and comparing it against a large list of magic numbers. This entire operation is merely overhead; if the Content-Type were stored as part of the file data, it would already have been fetched by the time the system could even start file() scanning.

      As others have pointed out, BeOS did this right by making this part of the interface instead of like a Mac where the type/creator info is hidden from the user and not editable without downloading additional software.
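
To make the cost argument in this comment concrete, here is a hedged sketch contrasting a metadata-only directory pass with a pass that opens every file. Both function names are hypothetical, and since no mainstream filesystem API returns a stored type here, the mtime stands in for "metadata that comes along with the directory listing".

```python
import os


def list_with_stat_only(directory):
    """Metadata-only pass: one stat per entry, no file is opened."""
    return {
        entry.name: entry.stat().st_mtime
        for entry in os.scandir(directory)
        if entry.is_file()
    }


def list_with_sniffing(directory, probe_size=100):
    """Content pass: every file is additionally opened and partly read."""
    headers = {}
    for entry in os.scandir(directory):
        if entry.is_file():
            with open(entry.path, "rb") as handle:
                headers[entry.name] = handle.read(probe_size)
    return headers
```

The second function does everything the first does plus one open/read/close per file, which is the overhead being argued about; how noticeable it is depends on the filesystem and cache state.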

      --
      -- Remember: Wherever you go, there you are!
    15. Re:Linux? by kijiki · · Score: 1

      Maybe it's my database background speaking, but non-normalized (for the uninitiated read: redundantly stored, more or less) data storage is bad news. If you have redundant data, the different copies WILL eventually become out of sync.

      UNIX file (1) uses the contents to determine type. Any other form of meta-data is redundantly storing the type and the contents.

      It's a pain when I download a wav file on my mac and the OS thinks Netscape should open it. The solution is to not store type metadata redundantly where it can become out of sync with the real type (determined by the actual contents).

    16. Re:Linux? by Isomer · · Score: 1

      ext2fs has the ability (although I believe it's not implemented) to store the first 96-odd bytes in the inode, effectively moving this information with the metadata. As the first 96 bytes will contain its magic, and probably its most important properties, you get almost all the advantages :)

    17. Re:Linux? by CSC · · Score: 1
      They simply cannot understand the idea of opening the application and then opening the document.

      Now I know that this can be interpreted as a virtue of the Mac OS because it's allowing you to focus on your "job" and not on your "computer", and maybe it is. But it also strikes me as a little self-defeating because the user doesn't ever get beyond the flail-on-the-mouse stage, which sounds to me like they're not getting much out of their computer.

      What exactly do you mean by getting much out of their computer ? Is it grokking a lousy paradigm that happens to be the most common or knowing how it works inside, like a car engine ?

      Whatever, I think this is a self-centered geek view (nothing personal!)

      --
      -- Colin
    18. Re:Linux? by swb · · Score: 2

      What exactly do you mean by getting much out of their computer ? Is it grokking a lousy paradigm that happens to be the most common or knowing how it works inside, like a car engine ?

      Obviously there is more to your car than knowing how to turn the key and steering wheel and push on the gas and brakes, even if you don't work on your own engine. There's making sure your tires have air, knowing how much you can load/tow with it, and so on.

      Likewise, getting more out of a computer doesn't require being able to program it. The example I gave here is something I've experienced more than once in a business environment. I'd expect appliance-like ignorance of operation from granny sending email, but shouldn't an office worker be expected to know a little more in order to do their job effectively? There are obvious questions about lost opportunities for enhanced productivity when a user isn't capable of getting beyond point-and-click.

      The apocryphal if not true intent of Jobs was to make the Macintosh a computer appliance. The reality is that it isn't one by a long shot, and someone using it to get a task done on a daily basis should reasonably be expected to know more about how to make it accomplish their tasks.

      The argument I'd make is that certain Macintosh features like filetyping and creator typing which are hidden from the user also inhibit the user from understanding essential aspects of how their computer works, eg "Why won't my computer open my file as expected?"

      It's not about being a geek OR an expert, it's about being functional.

    19. Re:Linux? by topham · · Score: 2
      I've run into this many times with the graphics departments at several companies.


      The Mac users have no idea how to deal with PC files which they CAN read.

    20. Re:Linux? by Anonymous Coward · · Score: 0

      >Anyone who has ever had to work with a Mac
      >user knows how confused they get by files
      >with an extension (eg, .DOC) but with no
      >creator and filetype information.

      Do you actually realize how offensive you sound?

      Not all Mac-users are dumb, just as not all Linux-users are dumb and not all Windows-users are dumb. And the fact that DOS and Unix used a bad system for tracking file type doesn't mean that *all* systems need to use this same system so people have to learn it. It's not the Mac's fault that other systems don't implement a transparent file type system. If a Mac-user has problems with files from Windows-using people, that might actually be because *Windows'* system is broken, not because of the Mac.

    21. Re:Linux? by swb · · Score: 1

      It seems naive to belong to 5% of the computer world and not have any idea how the other 95% works.

      I'm not defending the value of Windows' typing mechanism, either; I think it's archaic. I'd prefer a file(1) type checking mechanism built into whatever passed for the operating system's Finder interface so that file type could be determined by the signature of the data itself.

      New applications with "unknown" file signatures could expand the system database of known signatures. The Finder interface could offer a mechanism to associate found types with applications and a way to change those associations or override the association between a found signature and a specific file type.

    22. Re:Linux? by Vulture_ · · Score: 1

      Maybe you should read the comments in the file(1) magic files. I find many types that are commented out because they conflict with some other format.

      The thing people seem to be forgetting is that file guesses the file type. It is not deterministic by any means. It can make mistakes. I don't like it when computers make mistakes. It's bad enough that human brains make mistakes; I don't need computers doing it too!

      File is a crude, brutal, ugly hack, and the fact that it's necessary signifies a deficiency in the Unix system -- the lack of strong, deterministic file typing.

      It never ceases to amaze me how the Mac is the only major system to ever implement this mode of file typing; all others depend on file magic to guess the file type based on its contents (stupid; prone to error) or file extensions (very stupid; a file's name and type should be distinctly separated from one another).

      It also never ceases to amaze me how there is always an outcry against filesystem metadata, and how people cite the concept's inflexibility. This inflexibility is a design flaw on Apple's part, and when constructing a new system, there is no such constraint. Allow me to demonstrate:

      $ chtype foo.html
      text/plain
      $ edit foo.html
      [ ... your text editor starts ... ]
      $ chtype foo.html text/html
      $ edit foo.html
      [ ... your HTML editor starts ... ]

      Now that wasn't so hard, was it? In the above example, a chtype command is used to view/change the type of a file, and edit is used to edit the file. Of course, there would be configuration files (~/.ftyperc) for choosing which applications should handle which types, what functionality they provide (view, edit, print, etc), and perhaps some other information.

      This is basically what Debian's mime-support package does, except that it uses file(1) instead of filesystem metadata. And it works very nicely.
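
A small sketch of the chtype/edit idea above, assuming a JSON sidecar file as a stand-in for real filesystem metadata. `TypeStore`, `EDIT_HANDLERS` and `editor_for` are hypothetical names, the handler table plays the role of the imagined ~/.ftyperc, and none of this reflects Debian's mime-support.

```python
import json
import os


class TypeStore:
    """Stand-in for filesystem type metadata, kept in a JSON sidecar file."""

    def __init__(self, db_path):
        self.db_path = db_path
        self.types = {}
        if os.path.exists(db_path):
            with open(db_path, "r", encoding="utf-8") as handle:
                self.types = json.load(handle)

    def get_type(self, path, default="text/plain"):
        """What 'chtype foo.html' would print."""
        return self.types.get(os.path.abspath(path), default)

    def set_type(self, path, mime_type):
        """What 'chtype foo.html text/html' would do."""
        self.types[os.path.abspath(path)] = mime_type
        with open(self.db_path, "w", encoding="utf-8") as handle:
            json.dump(self.types, handle)


# Hypothetical per-user handler table for the "edit" action.
EDIT_HANDLERS = {"text/plain": "text-editor", "text/html": "html-editor"}


def editor_for(store, path):
    """Pick the editor from the stored type, never from the file name."""
    return EDIT_HANDLERS.get(store.get_type(path))
```

Note that the file name plays no part in the decision: the same foo.html opens in a different editor once its stored type changes.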

      --

      The only way the typical /.er can pick up a chick is with a forklift. -- AC

    23. Re:Linux? by Anonymous Coward · · Score: 0

      >I'd prefer a file(1) type checking mechanism
      >built into whatever passed for the operating
      >system's Finder interface so that file type
      >could be determined by the signature of the
      >data itself.

      afaik, MacOS X does that. it's probably the reason why sorting by file type is so sloooow in MacOS X.

      >The Finder interface could offer a mechanism
      >to associate found types with applications and
      >a way to change those associations or override
      >the association between a found signature and
      >a specific file type.

      You can't associate some creator type with a different program, and you can't associate all files of some file type with one program, because the creator type determines what program opens the file, the file type just tells it what kind of file it is, and whether a program can actually open it. if you don't have a program for some file you try to open (ie no program for a given *creator* code), MacOS gives you a list of all programs that can open this file (ie that can read files that have the same *type* code).

      it's easy to write an applescript that changes all files on your hd of a given file type to have some creator type though.
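
The lookup order described in this comment can be sketched roughly as follows. This is an illustration of the described behaviour, not Apple's implementation; `App` and `resolve_opener` are hypothetical names, and any four-character codes passed in are just example values.

```python
from typing import NamedTuple


class App(NamedTuple):
    name: str
    creator: str               # four-character creator code
    readable_types: frozenset  # four-character type codes the app can read


def resolve_opener(file_type, file_creator, installed_apps):
    """Return (owner, candidates) for a double-clicked file.

    The creator code picks the owning application. Only if no installed
    application has that creator code do we fall back to listing every
    application that can read the file's type code.
    """
    for app in installed_apps:
        if app.creator == file_creator:
            return app, []
    candidates = [
        app for app in installed_apps if file_type in app.readable_types
    ]
    return None, candidates
```

An "orphaned" file in this model is one whose creator matches nothing installed; its type code is still intact, so the candidate list can still be built.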

    24. Re:Linux? by cloudmaster · · Score: 2

      I suppose I'll give you that, as long as it's implemented in a BeOS-like fashion and not MacOS-like. I really like Be's FS. It's a shame that a few key apps are missing for Be (compatible web browser?), or I'd still be using it...


      I don't see the problem in integrating a file-like functionality into the kernel/filesystem and grabbing the data from the file itself instead of having an extra piece of data stored somewhere for each file. The stat() routine gets modified (or renamed to stat2()) to return an extra piece of information, say, a hash code into a magic table. The magic table can easily be stored as a hash to reduce the lookup time to near-imperceptible, and this would still allow [well-written] "legacy" apps to continue functioning properly. Extra possibly out-of-sync file problem avoided, backwards-compatibility preserved.

  4. Interesting... never thought about it before... by alexhmit01 · · Score: 2

    Very interesting. I never really thought about metadata before, but it brings up a lot of points about the mistake of using file extensions.

    File extensions do serve a convenient purpose with a command line, as you can manipulate them easier without using multiple tools. However, if the metadata was stored outside the filename, we could have (and had) UNIX, GNU, BSD, and DOS/Windows utilities to manage them in the past. If all systems were designed to keep track of the metadata, it would have been a better world.

    It is unfortunate that the technical lowest common denominator (DOS and DOS-based OSes) dictate so much of our system. While Windows NT based systems (including Win2K, WinXP, etc.) have made tremendous strides, there is a constant need to maintain compatibility that holds us back.

    I think that it makes sense for Apple to adopt the file extensions, as unpleasant as they are, to support a networked world. The author's suggestion of adding them on transport makes sense, but definitely leaves something to be desired. It would be confusing to transfer Word documents around and have the extensions pop on and off depending on the environment. If the Mac leaves them alone, it still leaves something to be desired because the file name changed when it left the Mac it was created on for the file server, and when it comes back it has a different name.

    It's a shame that a standard for storing the metadata wasn't created long ago. While the PCs wouldn't use the data, it is a shame to lose it. It is also a shame that we have to work towards the lowest common denominator. It's one thing to support it, it is another to adopt the conventions.

    Alex

    1. Re:Interesting... never thought about it before... by jhines · · Score: 1

      File extensions are wonderful, they are for the HUMAN to understand. They tell the human what is needed in an easy to understand format. Putting .wks on a file tells most people exactly what is needed, and the program itself can figure out the details from the file itself, or the metadata.

      The computer itself shouldn't use the extension for anything but hints for filling in unknowns with default values, according to the user's conventions.

    2. Re:Interesting... never thought about it before... by Anonymous Coward · · Score: 0
      I agree with you - that article probably was interesting if you've never thought about metadata before.


      Unfortunately if - like me - you have spent time thinking about metadata, the article was boring and an insult to the intelligence - and also wrong on a number of counts.


      For instance, the .txt extension isn't filetype metadata. It's filename metadata interpreted as a filetype by programs. DOS never had a concept of filetype. Similarly, .tar.gz isn't filetype metadata either.


      Secondly, filetype is not immutable. Programs might quite reasonably want to load a file as binary, or as text, even if it is "HTML".


      Thirdly, he makes only one small allusion to the actual future of metadata - when he says "I'm ignoring object repositories".

    3. Re:Interesting... never thought about it before... by AndrewHowe · · Score: 2

      Well, I feel kind of in the middle, because I had thought about it before (but not recently) but I still thought it was interesting.
      If I might address your second point, the author did go to some trouble to stress that policies are distinct from the metadata itself. Opening an HTML document as binary doesn't change the fact that it's an HTML document.

    4. Re:Interesting... never thought about it before... by Anonymous Coward · · Score: 0

      I have no idea what ".wks" means on the end of a file, and I have been using Windows, Mac, and Unix computers since about 1988, and have even written a number of computer books. How are the 50% of Americans and the 70% of the world that have never used a computer yet supposed to deal with ".wks"? Do you know what an ".all" file is? I use them all day, although I don't have to use the filename extension unless I want to.

      Also, I can tell you for a fact that ".cpt" is both a Corel Photo-Paint document and a Windows Control Panel. How many more things will it be in a few years?

      Another thing that's hard for new users is ".html", ".mpeg", ".jpeg" being distorted into ".htm", ".mpg", and ".jpg" on Windows. Worse than that, on Windows, ".html" and ".htm" can be two different file type entries.

      Another drag is that it is common on Windows to install an app like RealPlayer and then all of your media files actually change in their appearance and function, which is a huge drag if you are a content creator, and all of your work suddenly seems to have been modified. That's a huge, huge bug. Real and Microsoft have actually been to Federal court over this "stealing of file types". It couldn't happen on Mac OS. Documents keep their file type and creator information within themselves, instead of having a central database in the OS just guess what app originally created the document.

      > File extensions are wonderful, they are for
      > the HUMAN to understand.

      I think the icons that Mac OS uses in place of filename extensions are easier for the human to understand. A little Word icon stamped on a piece of paper with writing on it identifies the file as a Word document much more readily than just ".doc" being on the end. It is much more intuitive for the new user, that's for sure.

      There's no way you would use the filename extensions system if you were starting from scratch. It just became a custom for bearded Unix geeks to stick the file type on the end of the file since there were no icons. At this point, it's time to put that info in a field of its own. Think about how much time and energy is being put out by the user instead of the computer when people rename files and have to think of the right extension to put back on there.

      I am praying that "hidden" filename extensions in Mac OS X 10.1 aren't affected when you rename the file. In other words, if "File.doc" is shown as "File" and I rename it to "Foo", I just want the file to be renamed "Foo.doc". I never understood why Windows would pop up and ask you if you wanted to change the extension when all you did was change "File" to "Foo". How many times would you really do that? And if you were, then surely you would unhide the filename extensions?
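
The rename behaviour hoped for above could look something like this sketch. `rename_visible_name` is a hypothetical helper, not an actual Mac OS X or Windows API, and it assumes only the final suffix counts as the hidden extension.

```python
import os


def rename_visible_name(stored_name: str, new_visible_name: str) -> str:
    """Rename the visible part of a file name, keeping the hidden extension.

    stored_name is the real on-disk name (e.g. "File.doc"); the user only
    saw and edited the part before the extension.
    """
    _, extension = os.path.splitext(stored_name)
    return new_visible_name + extension
```

With this rule, renaming "File" to "Foo" when the stored name is "File.doc" yields "Foo.doc", and a file with no extension simply takes the new name as typed.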

    5. Re:Interesting... never thought about it before... by Anonymous Coward · · Score: 0

      .cpl is Control Panel, not .cpt.

  5. Take Advantage of Apple's Decisions by Anonymous Coward · · Score: 0

    I wouldn't put it past MS if they were to take advantage of the controversy in the Mac community about the addition of file extensions in Mac OSX by getting rid of them altogether in future versions of Windows/NTFS. One of MS's true strengths lies in implementing others' (usually dated) ideas and standards in a new package with a shiny new name and/or acronym and stunning the world with a breathtaking new direction for Windows. I can see the articles now: "Windows KY Drops Need for File Extensions, Greater Control of File Naming."

  6. Re: Forks and roads by benedict · · Score: 2

    Someone didn't read the article. Apple stored the file type and creator data in a dedicated metadata storage area (analogous to an inode), *not* in the resource fork. Siracusa stressed this point *several* times.

    --
    Ben "You have your mind on computers, it seems."
  7. Implementation difficulties by tbo · · Score: 2

    I think the reason Apple went to a dual file type system (extensions and metadata) is because it's too hard to implement the necessary level of interoperability otherwise. Suppose you want to keep the usual Classic MacOS method of just having file type/creator code metadata. You merrily store your files on your hard drive, with no extensions in sight.

    Now you share your drive on a heterogeneous network. A Win98 box connects and looks at your files via NFS. Does the MacOS X-side NFS server automatically translate the filenames and add extensions?

    Then another MacOS X user uses ssh to connect to your box. He types 'ls'. Does he see "virtual" extensions or not? What if it's a Windows user telnetting in? How would your box even know what OS the remote user was coming from?

    It's just too easy to run into inconsistencies if you stick with the system of mapping file name extensions. Yes, extensions annoy me, especially since older versions of MacOS are stuck with a 31 character limit (yes, it's 31, not 32--Ars is wrong), and I have to keep file names short to be backwards compatible. Unfortunately, it's just another bad MS decision we have to live with.

    1. Re:Implementation difficulties by Snocone · · Score: 2

      (yes, it's 31, not 32--Ars is wrong)

      A 31-character Pascal string is 32 bytes long.
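
      The arithmetic can be sketched in a few lines. This is only an illustration of a length-prefixed (Pascal) string in a fixed 32-byte field, not a description of how any Mac OS toolbox call is actually implemented; the function name and the zero padding are assumptions made for the example.

```python
def to_pascal_string(name: str, field_size: int = 32) -> bytes:
    """Encode name as a length-prefixed string padded to field_size bytes.

    Hypothetical helper for illustration only.
    """
    data = name.encode("mac_roman")
    max_chars = field_size - 1  # one byte of the field holds the length
    if len(data) > max_chars:
        raise ValueError(f"name is {len(data)} bytes; limit is {max_chars}")
    # Length byte first, then the characters, then zero padding.
    return bytes([len(data)]) + data.ljust(max_chars, b"\x00")


field = to_pascal_string("A" * 31)
print(len(field), field[0])
```

      A 32-character name would need a 33-byte field, which is why the example raises an error for it.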

    2. Re:Implementation difficulties by megaduck · · Score: 2

      Bingo. Well said.

      --
      This .sig for rent.
    3. Re:Implementation difficulties by tbo · · Score: 1

      Yes, and that's the reason the Mac has 31-character filename limits. Having 33-byte strings would have been annoying. Ars said the file names were limited to 32 characters, and I corrected them by saying 31. If they'd said filename strings, they would have been right.

    4. Re:Implementation difficulties by John+Siracusa · · Score: 1
      Ars said the file names were limited to 32 characters, and I corrected them by saying 31.

      Where does the article say that? You're right about the length, and I'd like to correct it if I made that mistake. But I can't find any mention of 32-character file name limits in the article.

  8. Mac user's problems by Anonymous Coward · · Score: 0

    Metadata in Mac OS is nifty and Microsoft tried to mimic lots of the features in Mac OS from Windows 9x onwards (associating file extensions with types and hence applications).

    However, the article still seems to be just some whining Mac user who fails to understand that MOST systems in use today do not have explicit metadata describing a file's type (UNIX doesn't, do things like VMS have such metadata in the filesystem?).

    In fact, for proper design, a filesystem should only be concerned with the minimum features necessary to store and retrieve data. It is up to the application and user to determine whether it makes sense to use the data in some contexts. For example, you can name files anything you want, extension or not, in Unix. It is up to the application to determine whether it can extract meaningful data from the file.

    One problem with Mac OS 8.1 I have noticed is that the user cannot specify DIRECTLY which application handles a file type: it seems the application just registers itself automatically. When the application is removed/upgraded however, the associations are not updated (probably because this version of Mac OS does not really have an install/uninstall mechanism). How useful is that?

    The article was so damn long I only read a few parts, so feel free to point out any problems.

    1. Re:Mac user's problems by David+Roundy · · Score: 1
      In fact, for proper design, a filesystem should only be concerned with the minimum features necessary to store and retrieve data.

      On the contrary, that would mean that the file system shouldn't support file ownership, permissions or dates. None of these are strictly necessary, but they certainly can be darn convenient, and I definitely would rather be using a file system that supports them.

      In fact, heck, we don't even need to have the file system store the filename! Why not simply store it in a block at the beginning of a file? If you want to be POSIX compliant, the operating system could strip it off before passing the data to an application, so it really wouldn't cause a problem, would it?

      A file system should support all the metadata which can most conveniently be dealt with at the file system level. True, every file could have a date modified section, which each application updates every time it modifies the file, but this would be inconvenient and error prone, since any application could easily mess it up. By supporting this at the file system level, the modification date is considerably more reliable.

      File types are not as straightforward as modification dates, since without application support the OS has no way of knowing what the file type is. On the other hand, file type is very useful information, useful enough that most programs want to mess up my filenames with it. If they're going to go to the trouble of doing that, they can certainly go to the trouble of defining a file type.

    2. Re:Mac user's problems by Anonymous Coward · · Score: 0

      >One problem with Mac OS 8.1 I have noticed is
      >that the user cannot specify DIRECTLY which
      >application handles a file type: it seems the application
      >just registers itself automatically. When the application
      >is removed/upgraded however, the associations are not
      >updated (probably because this version of Mac OS does
      >not really have an install/uninstall mechanism).
      >How useful is that?

      you don't understand the system: there are two types, a file type and a creator type. applications don't "register" themselves for a file type. files are themselves associated with programs through their creator type.

  9. Slashdot 2.0 got smackdotted by ScooterComputer · · Score: 0, Offtopic

    The new code is REALLY broken, as nearly 100 posts are missing from this article now...

    --
    Scott
    "Hokey religions and ancient weapons are no match for a good blaster at your side, kid."
  10. Fallacies in Fundamentals by Anonymous Coward · · Score: 1, Insightful

    So far, there are at least 3 fallacies in the "Fundamentals" section:

    1) A file's size is not metadata: A file can best be defined as an ordered set of bytes (or bits, or words, or whatever atomic unit your system uses), and the size of that set is intrinsic to it, not external.

    2) A file's modification time is conceptually unrelated to its contents. For example, most systems consider a file "modified" even when its contents are replaced by totally identical contents, and some systems provide means to change a file's contents without changing its modification time. Generally, systems use the modification time to note the time of an action that the user would see as causing a file to be modified, which is not always the same thing as noting the time that a file's contents are actually changed. I know of no system that records the latter time.

    3) A file's type can change at will, not just to increase or decrease the "accuracy" of the typing. It's rare that a file would be useful when viewed as data of two or more independent data types, but there's nothing intrinsic in the concepts of files, their types, or metadata, to prevent this. Thus, for example, hackers can get some perverse enjoyment from writing source code that works simultaneously in multiple programming languages.

    In general, the author's categorization of metadata into "immutable" and "mutable" is nonsensical. File metadata, by definition, is independent of file data, and is therefore mutable independently of it. Sometimes systems create tighter links between metadata and data, for example when Photoshop causes files created with it to be of a certain type, or when users make sure the names of files important to them are in uppercase, but that's a characteristic of the system (Photoshop or user conventions in these examples), not an intrinsic characteristic of data and metadata... And in the introduction, the author warns against reading the "Fundamentals" section with an eye on system implementations :-).

    I'm going to guess the author is reaching beyond logic to make this categorization so as to give file typing a role distinct and more important than file naming. Needless to say, this is counter-productive.
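
    Point 2 is easy to try on a POSIX-style system. The sketch below is only one way to show it, using Python's standard `os.utime`; the helper name, the file name, and the pinned timestamp are all made up for the demonstration.

```python
import os
import tempfile


def rewrite_preserving_mtime(path: str, new_contents: bytes) -> None:
    """Replace a file's contents, then put the old timestamps back.

    Hypothetical helper for illustration only.
    """
    before = os.stat(path)
    with open(path, "wb") as f:
        f.write(new_contents)
    # Restore access and modification times recorded before the write.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "note.txt")
    with open(path, "wb") as f:
        f.write(b"first draft")
    # Pin a known whole-second timestamp so the comparison is exact.
    os.utime(path, ns=(10**18, 10**18))

    rewrite_preserving_mtime(path, b"second draft")

    same_mtime = os.stat(path).st_mtime_ns == 10**18
    with open(path, "rb") as f:
        contents_changed = f.read() == b"second draft"

print(same_mtime, contents_changed)
```

    The data changed and the recorded modification time did not, so the two are linked only by whatever convention the system and its tools choose to follow.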

    1. Re:Fallacies in Fundamentals by John+Siracusa · · Score: 1
      1) A file's size is not metadata: A file can best be defined as an ordered set of bytes (or bits, or words, or whatever atomic unit your system uses), and the size of that set is intrinsic to it, not external.

      Yes, it is "intrinsic", but it's also metadata. Think of a person's gender. It is intrinsic, but it is also information about that person.

      2) A file's modification time is conceptually unrelated to its contents.

      It depends on the implementation, as you point out.

      Generally, systems use the modification time to note the time of an action that the user would see as causing a file to be modified, which is not always the same thing as noting the time that a file's contents are actually changed. I know of no system that records the latter time.

      The existence of implementations that behave in the way described is irrelevant in the fundamentals section. There's no reason that the modification date semantics described couldn't exist.

      3) A file's type can change at will, not just to increase or decrease the "accuracy" of the typing. It's rare that a file would be useful when viewed as data of two or more independent data types, but there's nothing intrinsic in the concepts of files, their types, or metadata, to prevent this. Thus, for example, hackers can get some perverse enjoyment from writing source code that works simultaneously in multiple programming languages.

      Such a change is a case of trying to increase accuracy while being constrained by an implementation that assumes files can have only one type at a time. What you want to do is say that file X is of type A, B, and C. But since you can only say it's of type A, you may "change" it to be of type B. But what you're really, conceptually, doing is trying to increase the accuracy of file type information by indicating that the file is of both types.

      As you noted, such situations are rare, but handling them does not negate any of the fundamentals, IMO.

      In general, the author's categorization of metadata into "immutable" and "mutable" is nonsensical. File metadata, by definition, is independent of file data, and is therefore mutable independently of it.

      Your definition of metadata obviously differs from mine.

    2. Re:Fallacies in Fundamentals by serial+frame · · Score: 1
      Ever think about how files are handled by a web browser? Web server? Well, although it's the webserver's job to determine a MIME type to put into the HTTP header response when sending a file, whether it be by determining the MIME type by extension or linguistic features (with file(1)), the web browser has to deal with it somehow.

      Moot point, but it's something to think about. For those webservers that don't give a MIME type other than application/octet-stream for something like a tarball or a zip archive, it's left completely up to the browser to figure out what to do with the file--without any other data. In the case of Netscape Communicator on Windows, extensions come into play.

      Whilst metadata in a file (or as a separate part of a file on a filesystem) is useful, it only goes so far. Extensions are still the most widely accepted way of determining file types--And yes, just like everything else, it has some security holes (double-extensions), but is generally a fault-proof way of doing things.

      (P.S. Quite obtrusive to the Mac newbie, PC Exchange is very good about handling metadata when transferring files via VFAT/MSDOS disks. Just needs a bit of configuration and patience.)
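
      The extension-based fallback described above can be sketched with Python's standard `mimetypes` module. This is a rough illustration of the idea, not how Netscape or any particular web server does it; `content_type_for` is a made-up name.

```python
import mimetypes


def content_type_for(filename: str) -> str:
    """Guess a MIME type from the file name alone.

    Falls back to application/octet-stream when the extension is
    missing or unknown. Hypothetical helper for illustration only.
    """
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"


for name in ("index.html", "mystery"):
    print(name, "->", content_type_for(name))
```

      Note that the lookup never opens the file, so a misleading extension produces a misleading type.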

      --

      -
      And the Angel said unto me, "These are the cries of the carrots! The cries of the carrots!"
    3. Re:Fallacies in Fundamentals by Anonymous Coward · · Score: 0

      Your definition of metadata obviously differs from mine.

      I don't think so. Metadata is, as you state in the introduction to your article, a simple enough concept:

      Would you not agree that metadata is information about a file that isn't that file's data? You yourself state in the article that "metadata [...] is distinct from the data itself" and "metadata [...] is often difficult (or impossible) to add once it is lost" (suggesting that metadata isn't even derivable from the file's data).

      Since we seem to agree on the definition of metadata relative to data, let's see whether we agree on the definition of data: You state in the article that "Without any metadata, this file is just data: a bunch of bits". This seems to agree with the definition in my post of data as "an ordered set of bytes (or bits [...])". Note that a set (or a "bunch") contains information about the value of the bits/bytes in the data, the order in which they appear, and also their number (the size of the data).

      About whether a file's size is metadata, you wrote:

      Yes, it is "intrinsic", but it's also metadata.

      Sorry, I don't think that's logical. Clearly, with the above definitions, a file's size is part of the file's data, so it's intrinsic. According to my definition and your quotes above, that disqualifies it as metadata. This meshes with the fact that (barring technical malfunctions) file size is always preserved when files transit through systems, even when those systems don't support any type of metadata.

      About file type changing and "accuracy":

      But what you're really, conceptually, doing is trying to increase the accuracy of file type information by indicating that the file is of both types.

      Granted... Contrary to what I argued, changing a file's type can indeed be seen as always increasing or reducing "precision". I withdraw that fallacy claim :-)

      About whether file modification times are always tied to their content:

      There's no reason that the modification date semantics described couldn't exist.

      Sure, but there's no reason that it should. Since implementation is irrelevant in the Fundamentals section, it makes no sense to pick a particular implementation as an argument to tie modification time to file contents and so come up with an "immutable" categorization of metadata.

      In general, I stand by my claim that there's no purpose in cobbling together special categories of metadata. File typing is special because it has a special effect on users, not because it's "immutable", "essential", "independent", or anything like that.

      This being said, apart from what I consider to be excessive and incorrect categorization in the first 2 sections, I found this article excellent, particularly in its dissociation of metadata storage and usage policies. I still disagree with the conclusions, but the reasons for this disagreement are best presented in other posts :-).

    4. Re:Fallacies in Fundamentals by John+Siracusa · · Score: 1
      You yourself state in the article that "metadata [...] is distinct from the data itself" and "metadata [...] is often difficult (or impossible) to add once it is lost" (suggesting that metadata isn't even derivable from the file's data).

      Where did I say that metadata isn't derivable from the data? I said that some metadata, like size and type, is actually directly tied to the data. And I said that "metadata [...] is often difficult (or impossible) to add once it is lost", not "always."

      a file's size is part of the file's data

      Tell me where in a file's data I can read the file size. It is intrinsic (just as type is), but it is still metadata.

      file size is always preserved when files transit through systems, even when those systems don't support any type of metadata.

      They support essential metadata: metadata that is needed to access the file like name, location, and size.

      Since implementation is irrelevant in the Fundamentals section, it makes no sense to pick a particular implementation as an argument to tie modification time to file contents and so come up with an "immutable" categorization of metadata.

      I had to pick semantics for all the example metadata in the fundamentals section. It just so happens that modification date semantics vary more than the others in the real world, which is what's leading you into implementation thoughts.

      In general, I stand by my claim that there's no purpose in cobbling together special categories of metadata.

      I think there are clear categories, regardless of what you choose to call them.

    5. Re:Fallacies in Fundamentals by Anonymous Coward · · Score: 0

      > A file's type can change at will

      no. it can't. a text file is always a text file, whether it contains C or "the lord of the rings". you'll never want to change the type of a text file. the file type and the file extension are something *different*! mac-users use extensions too. BBedit knows whether I open a php-file or some random text file, because the php-file ends in .php, while the text-file doesn't have a file extension. yet bbedit *knows* it can read both of them, because they're both text-files. and both files can be associated with bbedit, because they have the correct creator code, too.

  11. BeOS has already SOLVED the FileType/Metadata prob by Eugenia+Loli · · Score: 2, Informative
    I suggest everyone read here:

    http://www.beosbible.com/exc_filetype.html
    and here:
    http://www.beosbible.com/exc_query.html

    The BeOS has solved the problem, years ago. The BFS has integrated all these features into the OS itself, so all applications are making use of them. The Byte.com BeOS articles from Scot Hacker are also a must read!

  12. Filetype metadata should be in-band. by zeda · · Score: 2, Insightful

    Unix pipes. How else are you going to get file type metadata if it isn't in-band. That is what the magic number is all about. Pipes, stdin, stdout, etc.

    I think this is purely an application level problem and not a filesystem problem.

    It still matters in the gui world too. If we ever develop GUI drag and drop style graphics filters and such, say a webcam output into a filter into something else, that info is still in-band.

    How would you represent the file type of a named pipe, or a socket?
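
    The magic-number approach mentioned above can be sketched as follows. The table is a small hand-picked sample of well-known signatures, and `sniff_type` is a made-up name; real tools such as file(1) use a far larger database. The point is that the check works on any byte stream, including a pipe, without a filesystem being involved.

```python
import io

# A few well-known leading byte signatures (sample only).
MAGIC_NUMBERS = {
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
    b"GIF87a": "image/gif",
    b"GIF89a": "image/gif",
    b"%PDF-": "application/pdf",
    b"\x1f\x8b": "application/gzip",
}


def sniff_type(stream: io.BufferedReader) -> str:
    """Guess a type from the leading bytes without consuming them."""
    head = stream.peek(8)[:8]  # peek leaves the read position untouched
    for magic, mime in MAGIC_NUMBERS.items():
        if head.startswith(magic):
            return mime
    return "application/octet-stream"


# Stand-in for a pipe: an in-memory buffered stream.
pipe = io.BufferedReader(io.BytesIO(b"GIF89a" + b"\x00" * 10))
print(sniff_type(pipe))
```

    In a real filter the same function could be handed `sys.stdin.buffer`, which is also a buffered reader, and the bytes would still be there for the next stage to read.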

    1. Re:Filetype metadata should be in-band. by Anonymous Coward · · Score: 0

      Are you saying that because on Unix, all files are essentially just a sequence of bytes, and can be "poured" down a pipe, that this is a reasonable design restriction? There are some operating systems for which the file system has native support for indexed sequential files, direct access files, and even more structured "files" such as relational databases or arbitrary objects. Examples would be OS/400, VMS, OS/390. A file system which simulates a paper tape reader and punch doesn't seem like much more than a lowest common denominator starting point.

    2. Re:Filetype metadata should be in-band. by Anonymous Coward · · Score: 0

      Well, that's the trouble with UNIX-heads. Lowest common denominator is all they are capable of thinking about. And they are so far out of it, they don't even know it. They actually believe their way is the best way!

    3. Re:Filetype metadata should be in-band. by zeda · · Score: 1

      It's about the "everything is a file" concept. File semantics should be independent from the file system, because of networking and other things.

    4. Re:Filetype metadata should be in-band. by Anonymous Coward · · Score: 0
      The "everything is a file" concept is bullshit and doesn't even apply anymore. File types in most versions of Unix already ARE implemented in metadata, just in a very limited way. That's how the OS knows whether a file is a pipe, or a device, or a directory (yes, directories are files too!), or a soft/hard link, or just a plain file. If that stuff isn't in metadata, how in the world would the OS know what kind of file it was? Do you really think that pipe files just happen to have some content in them that says something along the lines of, "hey, OS, I am a pipe and you should redirect your input and output to process PID"?

      This is all just the logical extension of that thinking. It's saying, "well hey, if we already tell the OS that these files are of these types in metadata, why can't we say all these other files are of all these other types too? Maybe even do it with MIME types for great internet interoperability since HTTP servers and email clients all already use it!" I'm sure that is exactly what passed through the folks at Be's heads when they wrote BeFS which is a wonderful file system.

      Think, dude. Think!
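
      For anyone who wants to see that limited metadata for themselves, here is a minimal sketch using Python's standard `os` and `stat` modules. The function name and the labels are made up for the example; the kind is read from the mode bits returned by the stat call, not from anything inside the file.

```python
import os
import stat


def kind_of(path: str) -> str:
    """Name the file kind recorded in the mode bits for path.

    Hypothetical helper for illustration only.
    """
    mode = os.lstat(path).st_mode  # lstat: do not follow symlinks
    checks = [
        (stat.S_ISDIR, "directory"),
        (stat.S_ISLNK, "symbolic link"),
        (stat.S_ISFIFO, "named pipe"),
        (stat.S_ISSOCK, "socket"),
        (stat.S_ISCHR, "character device"),
        (stat.S_ISBLK, "block device"),
        (stat.S_ISREG, "regular file"),
    ]
    for predicate, label in checks:
        if predicate(mode):
            return label
    return "unknown"


print(kind_of("."))
```

      Everything beyond these few kinds (text, JPEG, Word document) is lumped together as "regular file", which is the gap the post above is talking about.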

  13. Don't get me wrong but..... by jrq · · Score: 1

    In principle I dislike extensions as much as the next man, but Operating Systems everywhere manage to do a repeatedly bad job of managing the resource side of things, yes I'm talking about Windows associations and Mac resource forks (I became quite popular, at an office, some years ago, providing utilities which would strip the first 128 bits out of Mac generated Photoshop and Illustrator files).

    The author of this piece even identifies the horror of allowing OSs to hide the extensions (one of the many things that gets fixed when working on a Windows machine). How could the possibility of allowing two files, in the same folder, to have the same name be acceptable, EVER!

    If there was a standard section, say a header, required by all files this would be fine, but this is obviously OS-dependent; remember most of that other metadata, creation date, etc etc is all stored in the FAT on most OSs. A world without extensions means that all file access would need to be pre-processed so that the correct application could subsequently be applied. Opening a file is, last time I checked, more of an overhead than examining an extension. And then what? The application police move in, preventing access to files that haven't been created in the right application?

    I want more metadata about files, I want to get useful, searchable information, perhaps the real place to put it is in the file itself, like so many applications already do. Taking the responsibility away from the filename and putting it in the hands of the operating system, for encoding and decoding this metadata is fine as long as the OS doesn't break, lose the key, and remembers to enforce gatekeeping functions so that when file goes off to play in the big wide world it doesn't drag along any of that OS specific data with it.

    --
    My UID is prime!
  14. Before another non-insightful mac zealot posts.... by Anonymous Coward · · Score: 0
    Yes, you can associate multiple file actions with a single file type in Windows, and actually customize it, or rename the file to reassociate it with a totally different type, *.txt to *.csv, for example. You can select the non-default option by RIGHT CLICKING on the file. There's also the built-in 'send to' and 'open with' for unassociated types. I've even written quick little batch files that search for specific strings and run a particular app based on the file's contents.


    How do you change the file type attribute on a mac? ;)

  15. File Extensions? by Auckerman · · Score: 2
    He's spent a lot of energy attacking Apple's recommendation that file name extensions should be added to files in addition to individually storing the creator/type in some OS X style fashion. He makes what appears to be a good argument... For those who didn't read, I'll briefly outline.


    1. funny.txt.vbs emails where .vbs is hidden (as OS X.1 will offer) can trick the user into opening an application


    2. Hidden extensions allow Finder.app and Finder.whatever to appear as "Finder" in the same Folder....


    He then goes on to say why Apple would recommend developers use extensions (which is redundant)... A networked world demands MacOS be a better "citizen". He claims extensions are unnecessary since email apps can append extensions to files when sent... Not to mention his speculation that Apple would drop its current model for a Windows model...


    Problem with his analysis. E-mail isn't the only way to share files in OS X. Currently OS X offers FTP, HTTP, Appletalk, NFS, SSH and X.1 will add CIFS. Appletalk handles Meta information transparently, going from Mac to Mac, no need for extensions. FTP, HTTP, SSH and NFS (NFS will almost always go to a flat filesystem) offer no way to store/send OS X style meta information. Yes OS X treats a NFS drive (and CIFS drives if you use Sharity) as a UFS drive and stores Meta data properly so that the Mac can use the file, but the remote computer has NO idea what kind of file that is, unless it has an extension. So a Mac user who casually copies an extension-less Word document to a PC zip disk, when they put that disk in a PC, it's useless (unless the user knows of the problem). So it is clear file extensions are needed for a networked environment....


    But Mac users don't like extensions so Apple will let us hide them.... which creates the problems he described (funny.txt.vbs and duplicate file names in the same folder). The first is really a non problem since the Mail application in OS X doesn't hide file extensions even if it's named funny.txt.app and double clicking it in Mail does NOT launch the file. This potential problem can be further alleviated by noting what kind of file it is below it in Mail. The second issue of duplicate file names can be solved easily too...don't allow it. In other words DumbName.jpg and DumbName.txt should not be allowed in the same folder. Then hide all the file extensions and the users would be none the wiser.

    --

    Burn Hollywood Burn
    1. Re:File Extensions? by John+Siracusa · · Score: 1
      Problem with his analysis. E-mail isn't the only way to share files in OS X. Currently OS X offers FTP, HTTP, Appletalk, NFS, SSH and X.1 will add CIFS. Appletalk handles Meta information transparently, going from Mac to Mac, no need for extensions. FTP, HTTP, SSH and NFS (NFS will almost always go to a flat filesystem) offer no way to store/send OS X style meta information.

      As I posted to the Ars discussion:

      Yes, there are issues with other users connecting to a Mac via NFS or Samba and pulling off extension-less files. But there's no reason that a user in that situation couldn't opt to twiddle the system preference that tells every app to always append extensions (or use the per-app settings, if only some apps should do so, etc.) My main objection is to being forced to use them across the board, with no option not to.

    2. Re:File Extensions? by stripes · · Score: 2
      double clicking it in Mail does NOT launch the file

      Um, yes it does. I just mailed myself FontExamplar.app, and double clicking on it did run it (after telling me it might have a virus and stuff, then I clicked the "What's a virus, please bone me" button and it ran).

      And we know that under Mac OS X.0.4 Mail.app doesn't hide extensions, but I'm not sure that OS X.1's Mail.app won't. I would expect it to follow the finder setting. We also don't know what OS X.1 does with more than one "extension", does it strip them all? None? Or just one? I'm guessing just one, but I'm aware that it is a guess.

    3. Re:File Extensions? by Brand+X · · Score: 2

      The second issue of duplicate file names can be solved easily too...don't allow it. In other words DumbName.jpg and DumbName.txt should not be allowed in the same folder. Then hide all the file extensions and the users would be none the wiser.

      Oooh, yeah. Here goes me...

      Create file: BaseClass.cpp
      Create file: BaseClass.h
      That file already exists, choose another name.
      me: WTF?!

      I generally use a source and header directory (file) differentiation, but not when it's a quick and dirty proof of concept test...

      --
      -- Still waiting for the Nike endorsement
    4. Re:File Extensions? by jafac · · Score: 2

      that's not the worst thing about funny.txt.vbs.

      The WORST thing is that stupid shell-scrap garbage, a feature which nobody ever uses, and which HIDES extensions even if you've configured the OS to explicitly SHOW extensions so you don't get clobbered with this kind of thing. You assume it's a txt file, because you KNOW you told the OS to show you extensions - but not when it's a shell-scrap file, which was an obscure enough feature that even seasoned power users were unaware of it in their day-to-day use of the OS.

      As a viral engineering feat, funny.txt.vbs was genius.
      As an OS feature, shell-scrap, as far as I'm concerned can remove the s-es and become hell-crap.

      --

      These are my friends, See how they glisten. See this one shine, how he smiles in the light.
  16. Wow by sllort · · Score: 1

    Since Slashdot was down so long, I actually had a chance to read and understand the article before posting. Perhaps there should be a pause between article posting and allowing comments? Anyway, to get on topic:

    I disagree wholeheartedly with the author's assessment that making the file type part of the name is a "bad thing". I disagree with his statement that the type of a file is immutable data. It is not. I have, many times, created a text file, written some html, and renamed it ".html" to load it in a web browser. Using a Mac has always been infuriating to me because I cannot easily change the application it is loaded with. It's changeable, sure, but not as easily as you can change to a simple, easily remembered mnemonic. Linux has echoed this paradigm for good reason. How hard is it to change a bash script to a different shell? Change the first line. On a Mac, this would require you to change an embedded 32 bit identifier.

    The argument is bogus. slashdot.pl and slashdot.txt should NOT collide on my desktop - the type IS part of the name. The mixing of file names & types was neither a hack nor a mistake. To those of us who use computers not as an information appliance but as information builders, the ability to easily manipulate file type data is a way of life.

    Thought provoking article, nonetheless.

    1. Re:Wow by Anonymous Coward · · Score: 0

      Sorry, but the fact that the file WAS HTML meant that its TYPE was HTML no matter what the metadata SAID it was. He's talking about what's REAL. An HTML file does not become a text file because you change its extension. You're thinking about IMPLEMENTATION.

    2. Re:Wow by Anonymous Coward · · Score: 0

      text & html are interchangeable. in fact, any ascii file can be both a text file and something more detailed. as he says, this is just a change in detail, but sometimes a change in detail changes the calling application.

      i don't like my operating system to get all uppity and think it knows more than me. it usually doesn't. but it DOES usually know more than your average mac user. that's why it works well for macs.

    3. Re:Wow by David+Roundy · · Score: 2, Insightful
      I disagree with his statement that the type of a file is immutable data. It is not. I have, many times, created a text file, written some html, and renamed it ".html" to load it in a web browser.

      I'm afraid you misunderstood his definition of immutable. In this example, you changed the data, and what was originally a plain text file became an HTML file. His definition of immutable was that the type does not change unless the file data changes.

      Also, it didn't mean that the metadata need be unchangeable, since it could be changed to reflect greater precision, or if it was wrong in the first place. For example, an html file is a text file (but more). So it is entirely reasonable to change the type from text to html (provided it actually is).

      slashdot.pl and slashdot.txt should NOT collide on my desktop...

      I agree that slashdot.pl and slashdot.txt should not collide, but that is just because they are part of the name. They should also not be required to be a given type.

      How hard is it to change a bash script to a different shell? Change the first line.

      I agree that metadata should be readily accessible. The only reason it is tough on a mac is because it was intended to be difficult, so that new users would have trouble shooting themselves in the foot.

      How would you like it if you had to name all your executable perl scripts ending with .pl? You don't, because the operating system specifies an (optional) header section to every executable file, which allows it to determine which program to run the file with. This is metadata, of the magic number variety. It is data added to the beginning of the file, for the sole purpose of determining its type (ok, in this case it also specifies the path to the perl executable and any flags to be passed it, but ignore that for a moment).

      The reason we have such magic numbers (which are also in most other standard file types, ps, gif, jpeg, etc) is because there are no common operating systems which support file types, so applications are on their own, and are forced to include what is properly (in my opinion) metadata in the file data itself. As long as we are going to store this data, why not have it in a standard location where it can be used by the rest of the operating system?
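
      To make the magic-number idea concrete, here is a minimal sketch in Python of how a tool might guess a type from a file's leading bytes. The signature table is a tiny illustrative subset (the GIF, JPEG, and PostScript prefixes are the well-known ones), and the function name `sniff_type` and its return labels are made up for this example; it is not how any particular OS or `file(1)` implementation works.

```python
# Illustrative signature table; real tools use a far larger database.
SIGNATURES = [
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"%!PS", "postscript"),
]


def sniff_type(data: bytes) -> str:
    """Guess a type label (hypothetical names) from the leading bytes."""
    if data.startswith(b"#!"):
        # Shebang: the rest of the first line names the interpreter and flags.
        first_line = data.split(b"\n", 1)[0]
        interpreter = first_line[2:].strip().decode("ascii", "replace")
        return f"script:{interpreter}"
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "data"  # nothing matched, so no type can be claimed


print(sniff_type(b"#!/usr/bin/perl -w\nprint 1;\n"))  # script:/usr/bin/perl -w
print(sniff_type(b"GIF89a\x01\x00"))                  # gif
```

      Note that the type information here lives inside the file data, which is exactly the arrangement the paragraph above argues would be better served by a standard metadata location.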

    4. Re:Wow by John+Siracusa · · Score: 1
      I disagree with his statement that the type of a file is immutable data. It is not. I have, many times, created a text file, written some html, and renamed it ".html" to load it in a web browser.

      You merely increased the accuracy of the file type metadata (HTML is a specific case of text), which was covered in the article.

      Using a Mac has always been infuriating to me because I cannot easily change the application it is loaded with. It's changeable, sure, but not as easily as you can change to a simple, easily remembered mnemonic.

      You're conflating the existence of file metadata with the application binding policy based on it--something warned against (multiple times) in the article.

      How hard is it to change a bash script to a different shell? Change the first line. On a Mac, this would require you to change an embedded 32 bit identifier.

      On the Mac, the file would likely be of the more general type "TEXT" in both cases. Speaking conceptually, of course you can change a file's type by changing its contents.

      slashdot.pl and slashdot.txt should NOT collide on my desktop

      In traditional file systems that use file name and location as the file identifier, I agree.

      the type IS part of the name.

      In the examples above, yes, it appears that the file type is encoded in the file names. But that doesn't have to be the case (nor should it be, IMO).

      To those of us who use computers not as an information appliance but as information builders, the ability to easily manipulate file type data is a way of life.

      There's nothing about file type metadata stored outside the file name that necessarily makes it any more difficult to change. Your thinking is constrained by existing implementations.

      That said, file type metadata should still never be changed unless it is to increase/decrease accuracy, or the file contents change (in which case the application changing the contents should set the new type metadata when saving the file). Actually changing (as opposed to improving or degrading the accuracy of) file type metadata without modifying the data itself is not useful.

    5. Re:Wow by Anonymous Coward · · Score: 0

      You're right - I hate that too about MacOS.

    6. Re:Wow by Anonymous Coward · · Score: 0

      How would you like it if you had to name all your executable perl scripts ending with .pl? You don't

      Yes I do. I run windows, which doesn't pull all that information hiding garbage that macs do. My operating system leaves the information out in the open where I can modify it. DUH!

  17. The UNIX system is equally idiotic by alexhmit01 · · Score: 4, Insightful

    The UNIX file-system is brilliant compared to DOS, but ONLY compared to DOS. It is still designed for command-line users convenience. I am NOT criticizing the command line, I use it daily under OpenBSD, Linux, WinNT4, Win2K, and Mac OS X. It is nice to have the control of a CLI, as well as the ability to run scripts.

    HOWEVER, the system of making things conveniently obvious for the CLI results in engineering decisions that give the OS less flexibility. GUIs can provide TREMENDOUS amounts of information BECAUSE the user decides when to get that information.

    For example, the filename and type need easy access for the user. For a GUI user, they need the filename and the type deciding the application binding. For a CLI user, including the type with the filename makes it easier to manipulate.

    While you could set up ls (or dir) with many flags to pick and choose the information, you create a minor mess. Additionally, things like changing the type to a list from a database is one thing for a GUI with a dropdown box, it's a nightmare to implement in a CLI. If you designed for the CLI, you made a tradeoff.

    Additionally, UNIX was developed in a hardware environment more restricted than the DOS world. Early machines used in development are nothing compared to modern machines.

    Take the NTFS file system. If you are on an NT4 machine, or a Win2K machine, (running NTFS of course, not braindead FAT/FAT32) you see filenames as normal. Inside the properties, there are MANY more options. Do it on a Win2K machine, and you see more information than on an NT4 machine if you look closely.

    The UNIX approach is old and dated. Microsoft has moved on, it's important for the UNIX community to do so as well. ACLs (implemented on NT) are FAR more flexible than users/groups. Private user groups are an ugly hack to handle the user/group system. The whole UNIX model needs to be modernized. There are ACL UNIX systems, but they aren't the mainstream.

    I love the power of UNIX-based servers, they give me tremendous capabilities. A proper CLI is awesome. But let's not kid ourselves. Beating Win95/Win98/WinME at ANYTHING was never impressive, they were ugly hacks onto DOS, which has its roots in the 8086 processor. Every time people tout the advantages of Linux, they compare it to Win9x. Beating a legacy desktop OS in terms of uptime, etc., is NOT impressive. Compared to Win2K, Linux's technical advantages are pretty minor. There are some, but not many. Compared to the BSDs or commercial UNIXes... well, Linux doesn't look that impressive. It has advantages and drawbacks, different engineering decisions.

    The problem with UNIX is an LCD (lowest common denominator) and designed-by-committee problem. Having a common API that programmers can target is tremendous, it helps with portability. However, failing to keep moving that API forward is a mistake.

    As it stands there are many applications that only work on one variant. Extending the UNIX common API once or twice a year to encompass vendor extensions would be a tremendous boost, and allow UNIX to escape this trap. If Sun has a great idea and incorporates it into Solaris, their ISVs should take advantage of it. The rest of the UNIX world should have it within a year (or two at most) so ISVs can port to other UNIXes. As it stands, you either write to an old standard OR to a particular UNIX. Neither is a good choice.

    Alex

    1. Re:The UNIX system is equally idiotic by Zapman · · Score: 2

      You said:
      As it stands, you either write to an old standard OR to a particular UNIX. Neither is a good choice.

      This isn't nearly the case. If the code IS DONE RIGHT, it can be compiled on all of the common Unices and a large portion of the uncommon ones. I've been reading Kernel Traffic's GNU/HURD report quite a bit recently, and to paraphrase one of the major package porters:

      "Those packages that use autoconf/configure have been amazingly easy to port, usually needing a few lines of editing at most. Those that don't require enormous amounts of effort."

      --
      Zapman
    2. Re:The UNIX system is equally idiotic by Anonymous Coward · · Score: 0

      How dare you question our beloved Unix command line? It is simple! It is powerful! It is the One True Way!

      Booooo! Booooo!

    3. Re:The UNIX system is equally idiotic by stripes · · Score: 2
      This isn't nearly the case, if the code IS DONE RIGHT [...] "Those packages that use autoconf/configure have been amazingly easy to port, usually needing a few lines of editing at most. Those that don't require enormous amounts of effort."

      That doesn't really mean autoconf/configure is a magic bullet. If I write something that uses kqueue and want to port it to Linux (or Solaris) I have to write non-kqueue code. Autoconf will merely figure out which part of my code to enable or disable, it won't take my kqueue code and make it work elsewhere.

      So to use the new interfaces, and be portable I have to write code to use the old interfaces. If the old interface doesn't exist I have to disable part of my application's features.

      Using kqueue as an example (once again), if I have an X program using a toolkit that lets me write my own file I/O callbacks and timeouts, but no callback for wait4 or the like, but I want to know when a child process exits, I don't have that many good choices. I can use kqueue which pretty trivially converts a process exit into file I/O (or at least a read ready event, plus a call to kevent rather than read). Then it will only run on two or three Unix systems. I can write a SIGCHLD handler that sets a flag and use a periodic timeout, but then I either burn CPU, or it takes too long to see the event. I could skip the SIGCHLD handler and just call wait4 with WNOHANG in the timer callback. That has roughly the same problems that the SIGCHLD answer does.

      I could write both, and then use autoconf to decide which to compile. Then I have to test both. The documentation has to say "On some platforms there can be a noticeable gap between the tracks, on others you get no gap in the tuneage".

      In other words it's not that doing the code right helps, it's doing the code twice, and it only helps so much.

      Last comment: yeah, the stuff that already uses autoconf ports easier than the rest of it because someone already did a lot of work to make it run multiple places, and may have decided to ditch features to avoid more problems.
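
      For anyone who wants to see the portable fallback described above, here is a rough Python sketch of the "call wait with WNOHANG from a timer" approach. It uses `os.waitpid`, a sibling of wait4, and `time.sleep` stands in for the toolkit's timer callback; the function names and the polling interval are assumptions made for the example, and it only runs on Unix-like systems.

```python
import os
import time


def spawn_child(exit_code: int, delay: float) -> int:
    """Fork a child that sleeps for `delay` seconds, then exits."""
    pid = os.fork()
    if pid == 0:
        time.sleep(delay)
        os._exit(exit_code)  # leave immediately, skipping parent cleanup code
    return pid


def wait_for_child_by_polling(pid: int, interval: float = 0.05,
                              max_polls: int = 200):
    """Poll with WNOHANG; return (exit_status, number_of_polls_used)."""
    for polls in range(1, max_polls + 1):
        done_pid, status = os.waitpid(pid, os.WNOHANG)
        if done_pid == pid:  # waitpid returns (0, 0) while the child runs
            return os.WEXITSTATUS(status), polls
        time.sleep(interval)  # stand-in for a toolkit timer callback
    raise TimeoutError("child did not exit within the polling budget")


if __name__ == "__main__":
    child = spawn_child(exit_code=7, delay=0.2)
    print(wait_for_child_by_polling(child))
```

      The tradeoff is visible in `interval`: a short interval burns CPU on wakeups, a long one delays noticing the exit, which is the gap a kqueue-style event avoids.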

  18. Infinite Meta-loop? by gnovos · · Score: 1

    I am not very knowledgeable about this kind of thing, so maybe I am just blowing smoke here, but don't you kind of fall into an infinite loop of metadata after a while? I mean, don't you need to know things like, say, the size of the metadata? Then you have to know the size of the meta-metadata? Then you have to know the size of the metameta-metadata? How do you get around that? (I'm sure there is a simple answer, but I am scratching my head.)

    --
    "Your superior intellect is no match for our puny weapons!"
    1. Re:Infinite Meta-loop? by Daniel+Dvorkin · · Score: 1

      Well, if as the author recommends we store metadata somewhere else on the disk, then the answer depends on the structure of the metadata file(s). One absurdly simple solution for the size problem is to decree that no files will ever be bigger than, say, 1 GB, and then always use (I think, off the top of my head) a thirty-bit number to encode file size. (Actually, thirty-two bits, i.e. four bytes, is a more likely choice in this situation. Eight bytes will get you file sizes considerably bigger than any hard disk we're likely to see any time in the near future ...) Alternately, since file size is intrinsic, you can calculate the size of file-size records on the fly as long as you use consistent end-of-record markers. There are a host of other solutions -- most of these aren't just issues for metadata, but for the design of all files.
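
      As a rough illustration of the fixed-width approach, here is a Python sketch of a metadata record whose own size never needs recording, because the layout is decreed in advance. The layout (an 8-byte unsigned size followed by a 4-byte type code) and the helper names are hypothetical, not taken from any real file system.

```python
import struct

# Hypothetical record layout: little-endian, 8-byte unsigned file size,
# then a 4-byte type code. Every record is exactly RECORD.size bytes.
RECORD = struct.Struct("<Q4s")


def pack_metadata(size: int, type_code: bytes) -> bytes:
    """Encode one metadata record into its fixed-width form."""
    return RECORD.pack(size, type_code)


def unpack_metadata(raw: bytes):
    """Decode a fixed-width record back into (size, type_code)."""
    return RECORD.unpack(raw)


raw = pack_metadata(1 << 30, b"TEXT")
print(RECORD.size)           # 12
print(unpack_metadata(raw))  # (1073741824, b'TEXT')
```

      Because every record is the same length, the reader needs no meta-metadata to find where one record ends and the next begins, which is what stops the regress.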

      --
      The correlation between ignorance of statistics and using "correlation is not causation" as an argument is close to 1.
    2. Re:Infinite Meta-loop? by Anonymous Coward · · Score: 0

      Easy to fix:

      1) Meta meta meta data is fixed size and format, so the meta data about it is implicit somewhere and doesn't need to be stored. Usually this is done at the Meta-data level, no reason to nest too far down.

      2) No metadata, just header the file with the appropriate metadata. This is essentially what you get when you open an HTTP socket or a pipe: the protocol used must be determined beforehand or negotiated before the data starts to flow back and forth. It's also what most binary formats do for type, they just stick it at the top of the format in the first few bits. The ImageMagick MIFF image format is also an excellent example of this: all the image metadata is just tossed on top of the file as text, then the binary part is appended to that.

  19. Apple's making the right decisions in a pinch by headless_ringmaster · · Score: 1

    Apple's got themselves in a situation where they're struggling to be accepted in the computing world. They're overcriticized - partly because people care about them.

    Apple tries to appeal to the open-source community, and what happens - "Well, it's not GPL'd," or better yet: "Apple needs to open-source everything in OS X!" (and in this article there's ASCII art of masturbation... c'mon - grow up!)

    They try to appeal to their own customers - and get criticized for clearing inventory fast and coming out with a better, cheaper computer (when this actually happens to everyone). More on this note: "Classic sucks" - What's Apple supposed to do? In order to survive by holding onto customers /and/ developers, they've found a (at least mostly working) solution to make the transition.

    As for the metadata - Apple's trying to stick to the more efficient use of file-typing that they have, which seems to be Ars Technica's choice, while still making interoperability with other OS's easier.

    Apple's succeeding in bringing more compatibility to their platform, and making it easier for developers to write for OS X - after all, without the programmers, Apple's f*d.

    Bottom line, let's acknowledge the fact that Apple's doing what it can to survive among OS's, and not badger them about the little things (for the moment) like "extensions look too much like windows," - quit your whining.

    --
    and they think I know what I'm doing....
    1. Re:Apple's making the right decisions in a pinch by Anonymous Coward · · Score: 0

      +* j a c k * o f f * j a c k * o f f * j a c k * o f f *+
      * \ |\ \ *
      j | / \ \ j
      a | | \ \ a
      c | | \ \ __ c
      k / \ \ \/__|__,,..---v--. k
      o | |__,,\.--"""\/ | \ o
      f | | \ _> f
      f| | _ _ _ _ | / f
      *| | /_v_v_v_\..---""'`-' *
      j| | __,,.| | | | | j
      a| / \ \_h_h_h_/ a
      c | | | c
      k | | | k
      o \ |\ | o
      f \ | \___/ f
      f \ | f
      * \ | *
      j | | j
      a | | a
      c | | c
      k | | k
      o | | o
      f | | f
      f | | f
      * | | *
      +* j a c k * o f f * j a c k * o f f * j a c k * o f f *+

  20. Poor technical expertise from a Mac Apologist by Anonymous Coward · · Score: 2, Interesting

    In a lot of ways this is a pretty good article, but there are a surprising number of instances when the author seems to have bent himself into thinking about a limited number of filesystems. Barring anything academic, experimental, or "fancy," it's pretty clear he's never tried to think about UNIX linked-list style filesystems: within the framework of his discussion, I would assert that a file's name is not part of its essential metadata in a UNIX-style FS. Why? All of the file information is contained in the file's inode and data blocks (the immediate decomposition into metadata and data being obvious). A name for the file is just an entry in a "directory," which is just another file. A given file might be listed in hundreds of different directories, nevertheless, there's still only one file. It might have a different name in each directory it's in; in which case it hardly makes sense to talk about "the filename," unless one is willing to assert that inode + data blocks don't constitute a file, and that each instance of a reference to a particular inode is to be considered a file.
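
    To illustrate the one-inode, many-names point, here is a small Python sketch that gives a file a second directory entry and then removes the first. The file names are arbitrary, and it assumes a file system that supports hard links (most Unix ones do).

```python
import os
import tempfile


def link_demo():
    """Create one file, add a second name, and report what stat sees."""
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "report.txt")
        second = os.path.join(d, "same-file-other-name")
        with open(first, "w") as f:
            f.write("hello")
        os.link(first, second)  # a second directory entry, same inode
        a, b = os.stat(first), os.stat(second)
        os.remove(first)        # drop the original name; the data remains
        with open(second) as f:
            survived = f.read()
        return a.st_ino == b.st_ino, a.st_nlink, survived


print(link_demo())  # (True, 2, 'hello')
```

    Neither name is "the" filename: both entries point at the same inode, and the contents outlive the removal of the name the file was created under.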

    Furthermore, the examples of "immutable" metadata (ill-considered vocabulary in the first place, I think) are poorly considered. File size can be altered without altering the underlying data on BSD-style unices that provide truncation and extension system calls. Modification time often gets changed on many systems without any change to the underlying data: many, if not most, kernels will change mtime any time a file is opened for write or append even if no subsequent writes are done to the file. "File type" is essentially a nonsense notion on most UNIX filesystems (and DOS too, given the weak representation), a file's type being an interpretation an imaginary multipurpose file handler is to give the data. In such situations, "file type" is decided either by regexp matching of the file name (which can be anything, remember) or by judicious use of magic (man magic if you don't get it). In many cases, this doesn't produce an unambiguous answer: 'file blah' produces 'blah: data' with amazing frequency. Arguably this just means that UNIX filesystems don't have an adequate mechanism to express the idea of "file type," but I would argue that regardless, the notion of file type is at least partially bogus. There's nothing to stop me from interpreting data many different ways: an XPM is something I can edit with an ordinary text editor, and hence a file of type "text," but it can also define pixmaps, so depending on what I want to do with it, it might be of at least two file types. Similarly, I can try to view a raw audio file as a compiled pixmap, or, to recapitulate the famous joke, 'cat /boot/vmlinuz > /dev/audio'. The results of such voluntary file polymorphism aren't always useful, but they sometimes are.

    There are further aspects of the article which are either incorrect, or at least fail to reflect my personal experience, but for the most part it's simply repetition of previous errors. It seems abundantly clear to me that the author is a thoughtful and well-educated person whose primary computing experience has been with Macs and post-DOS MS machines: and while he may have used UNIX-like operating systems, he doesn't know much about data representation of filesystems on them, and clearly hasn't considered more modern developments like filesystems with journaling or ACLs instead of permission bits.

    Perhaps my criticism is a little too sharp, I would like to emphasize that I liked much of the article and I laud the author for thinking about some important concepts in detail, but I feel the viewpoint adopted is one unnecessarily limited by the author's personal experience.

    1. Re:Poor technical expertise from a Mac Apologist by John+Siracusa · · Score: 2, Informative
      Barring anything academic, experimental, or "fancy," it's pretty clear he's never tried to think about UNIX linked-list style filesystems

      I assure you, that's not the case :)

      within the framework of his discussion, I would assert that a file's name is not part of its essential metadata in a UNIX-style FS. Why? All of the file information is contained in the file's inode and data blocks (the immediate decomposition into metadata and data being obvious). [...] unless one is willing to assert that inode + data blocks don't constitute a file, and that each instance of a reference to a particular inode is to be considered a file.

      But you can't get at the inode without the file's name and location. Inodes are not suitable as file identifiers since they are not guaranteed to be unique across the multiple disks that make up a given file system. The combination of the file name and location is unique in a given file system. "inode + data blocks" do constitute a file, but the file is inaccessible unless the file name and location are known. Therefore the file name is still essential metadata on a Unix-style file system.

      Furthermore, the examples of "immutable" metadata (ill-considered vocabulary in the first place, I think)...

      I considered "data-dependent", but stuck with immutable, for better or for worse.

      ...are poorly considered. File size can be altered without altering the underlying data on BSD-style unices that provide truncation and extension system calls.

      Truncation is a modification of the data.

      Modification time often gets changed on many systems without any change to the underlying data

      See my previous post on the topic. Yes, the semantics of modification date vary wildly. But there's no reason that the semantics I chose in the example in the fundamentals section (which tries to ignore existing implementations) couldn't exist.

      "File type" is essentially a nonsense notion on most UNIX filesystems

      I agree, which is one of the reasons I didn't address the Unix philosophy of reducing everything to a sequence of bytes or blocks at the OS level.

      the notion of file type is at least partially bogus. There's nothing to stop me from interpreting data many different ways: an XPM is something I can edit with an ordinary text editor, and hence a file of type "text," but it can also define pixmaps, so depending on what I want to do with it, it might be of at least two file types.

      What you want is a type hierarchy that indicates that XPM is of general type "text" and, more specifically, it is an X pixmap. There's nothing "bogus" about the notion of file type. I think you're unnecessarily constraining yourself to very simple metadata values.
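
      A type hierarchy of this kind can be sketched in a few lines of Python. The type names and the parent table below are invented for illustration; no existing OS is claimed to use this scheme.

```python
# Hypothetical type tree: each type names its more general parent.
PARENT = {
    "xpm": "text",
    "html": "text",
    "perl-script": "text",
    "text": "data",
    "jpeg": "image",
    "image": "data",
}


def is_a(file_type, general):
    """True if `file_type` is `general` or a more specific kind of it."""
    while file_type is not None:
        if file_type == general:
            return True
        file_type = PARENT.get(file_type)  # walk up; None past the root
    return False


print(is_a("xpm", "text"))   # True: a text editor may open it
print(is_a("text", "xpm"))   # False: not every text file is a pixmap
```

      With this arrangement an XPM does not need two types. It has one specific type, and anything that handles the more general "text" can still accept it.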

      Similarly, I can try to view a raw audio file as a compiled pixmap, or, to recapitulate the famous joke, 'cat /boot/vmlinuz > /dev/audio'. The results of such voluntary file polymorphism aren't always useful, but they sometimes are.

      Storing file type metadata does not necessarily dictate any OS policies (if any) based on that metadata--something the article tries to point out many times.

      It seems abundantly clear to me that the author is a thoughtful and well-educated person whose primary computing experience has been with Macs and post-DOS MS machines: and while he may have used UNIX-like operating systems, he doesn't know much about data representation of filesystems on them

      I'm not so sure about "well educated." ;-) My primary computing experience is on the Mac and in Unix. I just chose not to address the Unix angle, for various reasons.

      and clearly hasn't considered more modern developments like filesystems with journaling or ACLs instead of permission bits.

      I've certainly "considered" them, and I did mention ACLs (although spelled out instead of by acronym: page 4) in the article. That's all just more, richer metadata.

    2. Re:Poor technical expertise from a Mac Apologist by stripes · · Score: 2
      But you can't get at the inode without the file's name and location. Inodes are not suitable as file identifiers since they are not guaranteed to be unique across the multiple disks that make up a given file system. The combination of the file name and location is unique in a given file system. "inode + data blocks" do constitute a file, but the file is inaccessible unless the file name and location are known.

      Actually there have been a number of (frequently ill-considered) non-standard ways to open a file by i-number. Sun's backup co-pilot was the first I had heard of (in '91), but it turns out there were a lot before it, and after. Most allowed only root to do it, but some did not. The ones that didn't broke some of the Unix security semantics.

      Also you can get to a file a few other ways without involving its name, like recvmsg. Regrettably, something else had to know a name for the file at one point for them to work (that name may be gone now though -- all of the names may be gone in fact).

    3. Re:Poor technical expertise from a Mac Apologist by Fred+Ferrigno · · Score: 2

      I considered "data-dependent", but stuck with immutable, for better or for worse.

      Immutable or data-dependent, they're both inaccurate when discussing file types. Unless you can have a definitive and assuredly correct description of what exactly is in that file, you're bound to be wrong on occasion. As well, I'm not entirely convinced that it's important to have a definitive description of a file's contents; frequently a user will open a file in an app that isn't designed to handle it, intentionally. An OS does well to make changing a file's type as easy as possible, something the MacOS has had trouble with in the past. (Downloading a freeware app for something that should be an OS function is hardly convenient, IMO.)

    4. Re:Poor technical expertise from a Mac Apologist by John+Siracusa · · Score: 1
      Immutable or data-dependent, they're both inaccurate when discussing file types. Unless you can have a definitive and assuredly correct description of what exactly is in that file, you're bound to be wrong on occasion.

      You can be wrong about a person's gender too. Does that make gender as mutable as a person's name?

      As well, I'm not entirely convinced that it's important to have a definitive description of a file's contents; frequently a user will open a file in an app that isn't designed to handle it, intentionally.

      Nothing about storing file type metadata prevents this. You're confusing metadata storage with OS policies that may be based on it.

    5. Re:Poor technical expertise from a Mac Apologist by Anonymous Coward · · Score: 0

      As long as you are here answering questions...

      1. Aren't you confusing metadata storage issues with app policies that may be based on them, when you bring up .txt.vbs files as an argument?

      2. If the file type is a short string that happens to be stored by the OS as part of the name, but the user never sees the type because it is hidden, then how exactly is that worse than the data being stored in a separate filesystem structure? The only user-visible difference I can see is that the max filename length would be shortened by a few characters.

      3. MacOS X has a Unix base, and one of the goals was obviously compatibility with Unix software. Wouldn't this explain Apple's move towards file extensions better than the reasons you give in your paper?

    6. Re:Poor technical expertise from a Mac Apologist by John+Siracusa · · Score: 1
      Aren't you confusing metadata storage issues with app policies that may be based on them, when you bring up .txt.vbs files as an argument?

      No, I'm directly addressing a particular OS policy at that point. Storing file type metadata is good. Storing it encoded in the file name is not so good. Optionally hiding that part of the file name is a dangerous OS policy.

      If the file type is a short string that happens to be stored by the OS as part of the name, but the user never sees the type because it is hidden, then how exactly is that worse than the data being stored in a separate filesystem structure? The only user-visible difference I can see is that the max filename length would be shortened by a few characters.

      Here are a few problems off the top of my head. There's the "5 files named 'foo'" problem in dirs with foo.c, foo.h, foo.gif, foo.txt, and foo.html. Then there's the fact that other views of the file system (say, FTP or web) will see different names than the local user does. Totally restricting access to part of the file name is too limiting, so the possibility of mucking up the file type during the seemingly unrelated task of editing the name also exists. And, of course, there's the virus-spreading problem mentioned earlier.
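
      The first and last of these problems are easy to reproduce. Here is a small Python sketch, assuming a hypothetical display policy that simply hides the final extension; the function names are made up for the example.

```python
import os
from collections import defaultdict


def displayed_name(filename):
    """What the user sees if the final extension is hidden."""
    stem, _ext = os.path.splitext(filename)
    return stem


def collisions(filenames):
    """Group real names that become indistinguishable when displayed."""
    groups = defaultdict(list)
    for name in filenames:
        groups[displayed_name(name)].append(name)
    return {shown: names for shown, names in groups.items() if len(names) > 1}


listing = ["foo.c", "foo.h", "foo.gif", "foo.txt", "foo.html"]
print(collisions(listing))              # all five collapse to 'foo'
print(displayed_name("funny.txt.vbs"))  # funny.txt
```

      The second line of output is the virus case: only the last extension is hidden, so a script shows up looking like a text file.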

      MacOS X has a Unix base, and one of the goals was obviously compatibility with Unix software. Wouldn't this explain Apple's move towards file extensions better than the reasons you give in your paper?

      No, because Unix doesn't make any file type distinctions at the OS level (beyond character/block/special, etc.), so nothing Apple chooses to do on the "Mac" side of things is likely to impact Unix tools and apps, which will continue to work the way they always have.

    7. Re:Poor technical expertise from a Mac Apologist by Anonymous Coward · · Score: 0

      No, because Unix doesn't make any file type distinctions at the OS level (beyond character/block/special, etc.), so nothing Apple chooses to do on the "Mac" side of things is likely to impact Unix tools and apps, which will continue to work the way they always have.

      I wasn't trying to say that unix tools and apps depend on file types, but on filename extensions. Apple wants to use software like GCC which treats .c and .h and .cpp differently. Apple wants to use Apache which uses extensions to set outgoing MIME types. That sort of thing. I tend to think that Apple's decision to support filename extensions is a natural decision to make given OS X's Unix base.

    8. Re:Poor technical expertise from a Mac Apologist by John+Siracusa · · Score: 1
      I wasn't trying to say that unix tools and apps depend on file types, but on filename extensions. Apple wants to use software like GCC which treats .c and .h and .cpp differently. Apple wants to use Apache which uses extensions to set outgoing MIME types. That sort of thing. I tend to think that Apple's decision to support filename extensions is a natural decision to make given OS X's Unix base.

      Supporting file name extensions (i.e. parsing and understanding them) is desirable. Forcing them to be used even when they are not necessary (e.g. with more typical Mac user apps like Word or Photoshop) is not. Apple's policy does not even address Unix tools, which, of course, will continue to function as they always have. And even Apple's own developer tools and IDE will, of course, keep dealing with foo.h and blah.c. But "user-land" apps like word processors and such are also *required* to append file name extensions. That's the big issue, not what goes on in the Unix side of things or in the realm of application development.

  21. Sparse files in Unix by Anonymous Coward · · Score: 0

    What about them? File size is independent of file contents. Of course, does anybody really use sparse files these days? If I recall, they had problems across filesystems way back when. Some would fill the gaps with 0's and others wouldn't.
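
    A minimal sketch of the sparse-file point, assuming a POSIX-like system. The helper names (make_sparse_file, describe) are made up for illustration. Whether the hole actually stays unallocated depends on the filesystem, which is the cross-filesystem problem mentioned above.

```python
import os
import tempfile


def make_sparse_file(path, logical_size):
    """Create a file of logical_size bytes by seeking past the end and
    writing one byte; everything before that byte is a hole."""
    with open(path, "wb") as f:
        f.seek(logical_size - 1)
        f.write(b"\0")


def describe(path):
    """Return (logical size, bytes allocated on disk or None if unknown)."""
    st = os.stat(path)
    # st_blocks counts 512-byte units on POSIX systems; Windows lacks it.
    blocks = getattr(st, "st_blocks", None)
    allocated = blocks * 512 if blocks is not None else None
    return st.st_size, allocated


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        p = os.path.join(d, "sparse.bin")
        make_sparse_file(p, 10 * 1024 * 1024)
        size, allocated = describe(p)
        print("logical size:", size)
        print("allocated on disk:", allocated)
```

    On a filesystem that supports holes, the allocated figure will usually be far smaller than the logical size; on one that fills the gaps with zeros, the two will be close.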

  22. Not just a MAC/PC problem by Fjord · · Score: 1
    One of the problems I see with the importance of metadata is that many of the protocols we use can't transmit it. FTP has no way of sending metadata. HTTP can include some of it in the headers, but there are limits on the format for these headers.


    I think one of the reasons why extensions became so tightly tied to the mime-type is because of FTP. In the early days of the web, you could set up helper applications for mime-types, but if you were FTPing, you had to set it up as an extension. Now they are just linked.

    --
    -no broken link
  23. Great article by AndrewHowe · · Score: 2

    I liked this article, it was thought provoking. It reminded me of the Archimedes, with its 16 bit file types, and the Mac. Oh, the dear Mac. How many times did I scream at it, "yes you will bloody open that file!"... While it sits there all like, "No I bloody will not, it's the wrong type, I'm not even looking at it!"...
    Hang on a minute though. It's a bit much to have a go at Microsoft about file extensions. Unix? Written in C? .c and .h files? What would happen if you didn't have extensions?
    Anyway. Personally I get all excited by the idea of accessing files more as a database action. I know there are people that hate the idea.
    Interestingly, NTFS allows you to hang arbitrary stuff off a file. It's also a good way to hide stuff, because almost no-one knows about it. Oh, well.

    1. Re:Great article by Dr.+Scott · · Score: 1
      Hang on a minute though. It's a bit much to have a go at Microsoft about file extensions. Unix? Written in C? .c and .h files? What would happen if you didn't have extensions?

      Answer: not much. The C compiler is happy to compile files that don't end in ".c" if that's what you want. The preprocessor will include files that don't end in ".h". Those suffixes are for the convenience of the programmer. OK, make(1) depends on them, but then make is also for the convenience of the programmer.

  24. The problem with Just In Time file mapping by melatonin · · Score: 2, Insightful
    The thing that cheeses me is that the Internet is based on MIME. When I send a file "foobar" to a Windows user, and my email program tags the file as image/jpeg, the Windows email program should make the file name "foobar.jpg".
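
    A rough sketch of what that receiving mail program could do, in Python. The table and the function name are hypothetical and deliberately tiny; this is not how any particular mail client works.

```python
import os

# Hypothetical lookup table; a real client would need a much larger one.
MIME_TO_EXTENSION = {
    "image/jpeg": ".jpg",
    "image/gif": ".gif",
    "text/plain": ".txt",
    "text/html": ".html",
}


def name_for_saving(filename, mime_type):
    """Return a filename whose extension agrees with mime_type.

    Names that already carry the expected extension are left alone,
    and unknown MIME types leave the name unchanged.
    """
    ext = MIME_TO_EXTENSION.get(mime_type.lower())
    if ext is None:
        return filename
    _, current = os.path.splitext(filename)
    if current.lower() == ext:
        return filename
    return filename + ext


if __name__ == "__main__":
    print(name_for_saving("foobar", "image/jpeg"))
```

    Note that this only renames the attachment itself. Any document that refers to the file by its old name still points at the old name.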

    A lot of things would be better if the 'lower' OSes would just pay attention to MIME types. But there's one obvious situation where it falls apart.

    Joe Mac User makes an HTML document referencing a bunch of JPEG and Flash images. The JPEG and Flash files don't have extensions in their names. He sends his HTML directory to his Windows-loving friend. Assuming that the Windows or Mac apps paid attention to the file types (either Just In Time on the Mac to add extensions, or the Windows app paid attention to MIME), the user's documents would have appropriate extensions added to them. The Windows user's HTML is busted.

    While it royally bites that I have to put up with extensions in OS X, I can understand why Apple did this.

    Your non-tech-savvy computer users (I'd think that's 80% of computer users out there) are damn clueless, and would be completely unable to fix that HTML example.

    --
    Moderators should have to take a reading comprehension test.
    1. Re:The problem with Just In Time file mapping by kubrick · · Score: 1

      The thing that cheeses me is that the Internet is based on MIME. When I send a file "foobar" to a Windows user, and my email program tags the file as image/jpeg, the Windows email program should make the file name "foobar.jpg".

      But how does your email program know that it's a JPEG?

      Obviously we should adopt the approach of BeOS, and use MIME-type mapping and file attributes. Oh, except for all those people out there who use the wrong MIME types (or non-standard ones, anyway). :/

      The AmigaOS (and later BeOS) had Datatypes, an OS-level solution to the JIT file-typing problem... you could install input and output filters for any type of data you had datatypes for, and any program that supported datatypes (e.g. for images, or sounds, or text) could use these modules.

      Not that any of these approaches (either these I've described or any discussed so far in the responses) are the silver bullet, just my 2c worth...

      --
      deus does not exist but if he does
    2. Re:The problem with Just In Time file mapping by rpk · · Score: 1
      Joe Mac User makes an HTML document referencing a bunch of JPEG and Flash images. The JPEG and Flash files don't have extensions in their names. He sends his HTML directory to his Windows-loving friend.
      Yes, this collection of files will break a non-Mac-based HTTP server because they send file types based on extensions in the name. But strictly speaking the way particular HTTP servers work is an implementation detail and not a requirement of how the web is supposed to work.

      Mac Classic-based HTTP servers will translate from type codes (not extensions) to MIME types so when this bundle of files is served, any web browser on any platform will work with it, since it is not supposed to extract type information from URLs anyway.

      Does anybody know if Apple's mod_hfs does the same thing for Apache ?

    3. Re:The problem with Just In Time file mapping by melatonin · · Score: 1
      But how does your email program know that it's a JPEG?

      That was the whole point of the article :)

      The Mac encodes the file type as a 4 character code, 'JPEG'. So it's a JPEG file. The Be OS is no better or different as far as I know, except that it was designed after the MIME standard was introduced, so it's in sync with MIME.

      Hell, even Apple's ProDOS, which ran on the //e (81? 83?) had file types (creator codes too, I think.. it's been a while since I've done a catalog :)

      The AmigaOS (and later BeOS) had Datatypes, an OS-level solution to the JIT file-typing problem... you could install input and output filters for any type of data you had datatypes for, and any program that supported datatypes (e.g. for images, or sounds, or text) could use these modules.

      That's exactly what the Mac OS 8-9 has :P

      --
      Moderators should have to take a reading comprehension test.
    4. Re:The problem with Just In Time file mapping by kubrick · · Score: 1

      That was the whole point of the article :)

      The Mac encodes the file type as a 4 character code, 'JPEG'. So it's a JPEG file. The Be OS is no better or different as far as I know, except that it was designed after the MIME standard was introduced, so it's in sync with MIME.

      Hell, even Apple's ProDOS, which ran on the //e (81? 83?) had file types (creator codes too, I think.. it's been a while since I've done a catalog :)


      Any solution is imperfect, unfortunately. I quite like the one implemented in Directory Opus 5 (Amiga) and, I presume, 6 (Windows) -- a configurable list of different checks that could be applied, including filename matching, contents matching, byte searching, etc. The user could define their own filetypes, and what would be done to search for them. A bugger to maintain, but powerful and flexible :)
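
      A small sketch of that idea: a user-ordered list of rules, each combining filename matching with byte checks. This is not Directory Opus's actual configuration format; the Rule class, field names and sample rules are all assumptions made for illustration.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import Optional


@dataclass
class Rule:
    label: str
    name_pattern: Optional[str] = None   # glob matched against the filename
    starts_with: Optional[bytes] = None  # bytes required at offset 0
    contains: Optional[bytes] = None     # bytes that must appear in the sample

    def matches(self, filename, data):
        # Every check the rule defines must pass; undefined checks are skipped.
        if self.name_pattern is not None and not fnmatch(
            filename.lower(), self.name_pattern.lower()
        ):
            return False
        if self.starts_with is not None and not data.startswith(self.starts_with):
            return False
        if self.contains is not None and self.contains not in data:
            return False
        return True


def classify(filename, data, rules):
    """Return the label of the first matching rule, or None."""
    for rule in rules:
        if rule.matches(filename, data):
            return rule.label
    return None


# Hypothetical user-defined filetypes, most specific first.
RULES = [
    Rule("IFF ILBM picture", starts_with=b"FORM", contains=b"ILBM"),
    Rule("HTML page", name_pattern="*.htm*", contains=b"<html"),
    Rule("Text note", name_pattern="*.txt"),
]
```

      Rule order matters because the first match wins, which is part of what makes such a list a chore to maintain.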

      (DOpus 5 and 6 function as OS shell replacements, acting as the primary GUI, so they can be considered to be in the same arena as the 'OS' there...)

      --
      deus does not exist but if he does
  25. File extensions are a hack... by alexhmit01 · · Score: 1

    If I had a file that was a text file: Grocery List, which is the easiest way to display it (including a command line)...

    groclist.txt
    Grocery List.txt
    Grocery List Text File Notepad

    In the first case (old DOS 8.3 convention) you need to remember what you called it or set up a naming convention. For a grocery list, this may not matter, but for files you want to access in 8-10 years, it does.

    The second is sort of clear, it is a Grocery List text file. However, you only know that it stores text.

    The third provides more information. You know what it was created in, as well as the type. If you honestly think that extensions are clear to users, I think that you are mistaken. Users see icons and click, looking like a text file is a good indication.

    Furthermore, nothing prevents YOU, the user, from using extensions, or dots, etc., to name your files.

    For example:
    Grocery List.txt Text File Notepad
    OR
    Grocery List.text Text File Notepad

    In either case, the computer ignores the extension, using the fact that it is a Text File created in Notepad. You however, have included that with the filename for your convenience.

    Alex

    1. Re:File extensions are a hack... by Mandrias · · Score: 1

      I don't like looking at the icons to tell what type the file is... especially since I like to put things into a list view so I can see a lot of items at once. The view of putting multiple columns of large icons for me to sort through and select is nothing but cumbersome for me. I'll take a long list of extension prone filenames over that in most cases. Not saying that there isn't a better way, just that at least *some* people do prefer to look at extensions today rather than icons.

      Icons are pretty... but far too often confusing.

      --
      Use the Z-modem protocol between Information Superhighway routers to compress the plaintext. ~LordOfYourPants
    2. Re:File extensions are a hack... by Anonymous Coward · · Score: 0

      But list view can have an extra column that gives you the file type. You still don't have to make it part of the name.

    3. Re:File extensions are a hack... by Com2Kid · · Score: 1

      And for those of us out there who skim through the file list (I actually use the Detail view mode, much more convenient since I then just organize the files by size) by extension? It is quite nice to know what TYPE OF FILE I am dealing with just by looking at the end of the file's name.

      For that matter it makes life a lot easier for newbies too. I still have yet to figure out how to tell on MacOS what files are executables and what files are data files. Not to mention that when I wish to mess with Data files how am I to know what type of Data files they are? With file type extensions I can just look at the file name and say 'Hey, that is a .ini file, it is most likely in Plain Text so I can go in there and play around with it!'. Something that a lack of file type extensions would not allow for me to do just by scanning the directory's file listing.

  26. Linux thoughts by iabervon · · Score: 4, Interesting

    Linux has traditionally not bothered very much with file type. The user generally knows what to do with the file, and does so. What look like extensions are actually just generically part of the filename; there are conventions for them, but they are no more strict than the conventions for filenames in general (Makefile is probably a makefile, README is plain text, foo.c is C source, etc.).

    An important thing to realize is that file type, like, for instance, size, can be determined from looking at the data. In fact, many programs look at data files and determine the file format from the data; "file" does a pretty good job of detecting non-human-readable formats, even without knowing any information at all about the file type.
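
    A minimal sketch of content-based detection in the spirit of "file", using a handful of well-known signatures. The real magic database is vastly larger and checks much more than leading bytes; the table and function names here are illustrative only.

```python
# (leading bytes, description) pairs checked in order.
MAGIC_SIGNATURES = [
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"\x7fELF", "ELF executable or object"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"#!", "script with interpreter line"),
]


def sniff(data):
    """Guess a type from the first bytes of the data, ignoring the name."""
    for magic, description in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return description
    return "unknown"


def sniff_file(path):
    """Read just enough of the file to compare against the signatures."""
    with open(path, "rb") as f:
        return sniff(f.read(16))
```

    Plain text formats are the hard case, since a C source file and a grocery list have no signature to tell them apart.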

    Where this all breaks down, of course, is when the user wants to omit the program name. On a Mac, you normally double-click on a data file to open it (and hope to get a program that does what you want). On *nix, you traditionally have to specify the program-- and much of the time, you select a different program depending on the desired result: for foo.c, I could use emacs, or gcc, or I might want gcc -M (get dependencies), or even wc (to see how big it is), not to mention less or grep or etags.

    I think part of the Mac fascination with file type is due to the monolithic program structure; you find the file, and then you open a single program that does to it anything that you will ever do to it. In this model, there is a right program, and which program is right is based on file type. Windows clearly suffers greatly from having this model but not having a more reliable fashion of determining file type than Linux.

    Incidentally, has anyone else noticed that the MacOS scheme is equivalent to having 4 character extensions which aren't displayed, with the corresponding problem of having malicious executables named README.txt (or even README)?

    1. Re:Linux thoughts by Anonymous Coward · · Score: 0

      except that with the Classic MacOS scheme, you would see the icon of the program that the file belongs to. So a word document would have a word document icon and so on. If the parent application is not available and the file mapping is not present, then it would ask how to open the file. If it does have the application, then you would see that this "Readme.txt.vbs" file is in reality being opened by a vbs program. It is a different scenario. Now go back and fix that statement.

    2. Re:Linux thoughts by TWR · · Score: 3, Informative
      I think part of the Mac fascination with file type is due to the monolithic program structure; you find the file, and then you open a single program that does to it anything that you will ever do to it. In this model, there is a right program, and which program is right is based on file type. Windows clearly suffers greatly from having this model but not having a more reliable fashion of determining file type than Linux.


      You clearly don't understand the type and creator fields.


      There are TWO separate fields for each file in the classic Mac OS. One (TYPE) indicates what kind of file it is. The other (CREATOR) indicates what program will open the file by default. Each is four bytes long.


      The nice thing about this system is that you get a clean separation between file typing AND default launching application. It's other OSes which have the "monolithic" structure you're talking about.
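
      As a sketch of the two four-byte fields described above, here they are packed into eight bytes of metadata. The sample codes are illustrative rather than a registry of real creator codes, and the FinderCodes name is made up for this example.

```python
import struct
from typing import NamedTuple


class FinderCodes(NamedTuple):
    file_type: bytes  # what kind of data the file holds
    creator: bytes    # which application opens it by default


def pack_codes(codes):
    """Pack type and creator into 8 bytes, 4 bytes each."""
    if len(codes.file_type) != 4 or len(codes.creator) != 4:
        raise ValueError("type and creator must each be exactly 4 bytes")
    return struct.pack(">4s4s", codes.file_type, codes.creator)


def unpack_codes(raw):
    """Split 8 bytes of metadata back into separate type and creator."""
    file_type, creator = struct.unpack(">4s4s", raw)
    return FinderCodes(file_type, creator)


def retarget(codes, new_creator):
    """Change the default application without touching the type."""
    return codes._replace(creator=new_creator)
```

      The retarget helper is the separation in miniature: the creator can change while the type stays put.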



      Incidentally, has anyone else noticed that the MacOS scheme is equivalent to having 4 character extensions which aren't displayed, with the corresponding problem of having malicious executables named README.txt (or even README)?


      First of all, it'd be an 8 character extension. Secondly, List view on a Mac shows file type by default; an application is listed as "application program". Granted icon view won't discriminate unless you do a get info or sort by kind. Finally, if you don't trust the source of a file, don't open the file. This is common sense, no matter what extensions you are showing or whatever file system you are using.


      -jon

      --

      Remember Amalek.

    3. Re:Linux thoughts by SeanAhern · · Score: 1
      Linux has traditionally not bothered very much with file type. The user generally knows what to do with the file, and does so. What look like extensions are actually just generically part of the filename; there are conventions for them, but they are no more strict than the conventions for filenames in general (Makefile is probably a makefile, README is plain text, foo.c is C source, etc.).

      For the most part, you're right. But there are plenty of cases where this is violated. Ever try to link against a library that wasn't named *.a or *.so? Can't easily do it. Ever try to get gcc to compile something with a weird extension or no extension at all? It won't do it. Linkers and compilers are pretty fundamental parts of UNIX, and they don't play well with arbitrary file names.

    4. Re:Linux thoughts by uid8472 · · Score: 1

      gcc has the -x option to specify the language of a file where it can't figure it out from the name.

    5. Re:Linux thoughts by melatonin · · Score: 1
      Incidentally, has anyone else noticed that the MacOS scheme is equivalent to having 4 character extensions which aren't displayed, with the corresponding problem of having malicious executables named README.txt (or even README)?

      Yes, and there are apps that sort of masquerade as files, particularly READMEs. But what difference does it make? If there's one thing I've learned as a software developer, users do NOT read Readmes.

      If you're in list view the Mac will tell you it's an application. But fortunately, it's harder to write malicious code on a Mac than it is to do, say, #!/bin/sh, rm -rf /. You'd have to look up the file system API in a reference book, which reduces the trojan-writing to be uncool.

      --
      Moderators should have to take a reading comprehension test.
    6. Re:Linux thoughts by GypC · · Score: 2

      Uhh... rm -rf / is not going to work unless the administrator is stupid enough to run untrusted executables as root.

      rm -rf $HOME can be devastating if you don't back up your data, but it's hardly going to kill the system. I usually have a cron job back up the home directories to a separate partition on which the users have no write privileges (unless, of course, there is a tape drive available.)

    7. Re:Linux thoughts by Tchaik · · Score: 1

      > Incidentally, has anyone else noticed that the MacOS scheme is equivalent to having 4 character extensions which aren't displayed

      It isn't; there are 4 characters for the type _and_ 4 characters for the creator. That creator is important since double clicking on a text file will open it in Word / BBEdit / etc... if that's the creator. If the creator is not recognized, the OS will ask you to choose between applications that can deal with that type of files.

    8. Re:Linux thoughts by august · · Score: 1

      Incidentally, has anyone else noticed that the MacOS scheme is equivalent to having 4 character extensions which aren't displayed, with the corresponding problem of having malicious executables named README.txt (or even README)?

      When did this become a MacOS problem? What's the difference between that and my mom downloading a linux binary on her shiny Redhat install that she thinks is going to do something... uh.. neat?

      And I won't even mention windows' problems.

    9. Re:Linux thoughts by Anonymous Coward · · Score: 0

      Linux fucking sucks.. that's why this article isn't about it. Its dying; get over it.

    10. Re:Linux thoughts by Staff333 · · Score: 1

      The nice thing about this system is that you get a clean separation between file typing AND default launching application. It's other OSes which have the "monolithic" structure you're talking about.

      I think what iabervon is saying is that on a mac you usually just select a file and allow a specific program associated with the type, or the original program to open it. It's monolithic in the sense that the application used is tied to the file, whereas in linux (at least the traditional CLI) you would very frequently use a large variety of programs to operate on the same file.

    11. Re:Linux thoughts by mj6798 · · Score: 1

      The Mac scheme is great for a desktop system. It makes no sense for most of the applications that UNIX and Linux have been used for traditionally. Quite to the contrary: the Mac "forks" and "types" cause constant headaches for such uses.

    12. Re:Linux thoughts by DickBreath · · Score: 4, Insightful

      You don't understand the difference between TYPE and CREATOR. Imagine the following.

      dickbreath@toybox:~/dudes > ls -la
      total 31337
      -rw------- dickbreath users TEXT NPAD file1.txt
      -rw------- dickbreath users TEXT NPAD file2.txt
      -rw------- dickbreath users TEXT WORD file3.txt
      -rw------- john yum JPEG WORD file4.txt
      -rw------- sean yum JPEG GIMP file5.txt
      dickbreath@toybox:~/dudes >

      There are 5 files. Several of them have been MIS-named! Notice that "ls" has been cleverly modified to indicate the file TYPE and CREATOR metadata.

      file1.txt is a text file. (type TEXT) When you doubleclick it, it will open in Notepad. (creator NPAD)

      file3.txt is also text. (type TEXT) But when you double click it, it will open in -- surprise! -- Word!

      file4.txt is not text at all (type JPEG) although the filename might deceive some into thinking it was a text file. But when you've NEVER had to use this stupid ".txt" naming suffix thing, you wouldn't be deceived. In fact, you would wonder why on God's green earth anyone would put ".txt" on the end of a filename? The icon would clearly show it is jpeg, belonging to word.

      file5.txt is also not text (type JPEG), but surprise, it opens in a *different* application, this time, the GIMP! (Note type is JPEG, creator is GIMP)

      Finally, the icon displayed for a file is determined by the application. Each application has a database of icons to assign. The icon displayed is determined by the unique COMBINATION of type and creator.

      For instance, if GIMP can open JPEG, GIF, and PSD, then you might have a "family" of similarly styled gimp icons, yet each icon is visually distinct enough to make clear that the file is jpeg, gif, or psd. But another app, such as ImageView, might also have its own uniquely styled family of similar looking icons, but have "jpeg", "gif", and "psd" variations of those icons.

      When a file is GIF/ImageView, it gets the "gif" icon from the ImageView application. When a file is GIF/GIMP, it gets the "gif" icon from the GIMP application. The icon visually distinguishes what kind of data it is, and what application is going to open it.

      But you can always grab a GIF/ImageView file and drag-drop it onto GIMP. No sweat. In fact, if you then save the document from GIMP, the creator will be changed -- but type will still be GIF.
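
      The behaviour described above can be sketched roughly like this. The icon table, the "IMGV" creator code and the function names are all hypothetical; the only point is that the icon is keyed on the (type, creator) pair, and that saving from another application changes the creator alone.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FileEntry:
    name: str
    file_type: str  # what the data is, e.g. "GIF"
    creator: str    # which application opens it on double-click


# Hypothetical icon database: each application contributes one icon per type.
ICONS = {
    ("GIF", "GIMP"): "gimp-gif.icon",
    ("JPEG", "GIMP"): "gimp-jpeg.icon",
    ("GIF", "IMGV"): "imageview-gif.icon",
    ("JPEG", "IMGV"): "imageview-jpeg.icon",
}


def icon_for(entry):
    """Look the icon up by the combination of type and creator."""
    return ICONS.get((entry.file_type, entry.creator), "generic-document.icon")


def double_click(entry):
    """Double-clicking launches whatever the creator field names."""
    return entry.creator


def save_from(entry, app_creator):
    """Saving from another application changes the creator, not the type."""
    return replace(entry, creator=app_creator)
```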

      I apologize, if I come off as frustrated that such an advanced concept, invented such a long time ago, is still so relatively unknown by so many people who are so technically brilliant. And a lot of it is entrenched thinking. "Well, this is how we've always done it!" We laugh at MS for lack of innovation, yet I hear many here talk about not liking GUI's despite their now finally commonly accepted advantages, yet some of us stay stuck in the stone ages when it comes to how unix has always done things.

      Finally, other posters under this topic have complained about how hard it is to change the filetype compared to the filename. Really? They type "mv" to change the name, and "chown" and "chmod", but they can't change the filetype or creator? You have to (in KDE) right click, Properties to change the filename. Would it be so hard in the same dialog to edit the type and creator as well as the filename?

      I bet the same programming genius who could modify "ls" to display the filesystem's type/creator could also write new "chtype" and "chcrtr" commands.

      --

      I'll see your senator, and I'll raise you two judges.
    13. Re:Linux thoughts by melatonin · · Score: 1
      Uhh... rm -rf / is not going to work

      Well duh, but it's still going to f*ck the user and anything else they might have access to (say their Windows partition).

      Trying to achieve total wipeout wasn't my point :)

      --
      Moderators should have to take a reading comprehension test.
    14. Re:Linux thoughts by tbo · · Score: 1

      Incidentally, has anyone else noticed that the MacOS scheme is equivalent to having 4 character extensions which aren't displayed, with the corresponding problem of having malicious executables named README.txt (or even README)?

      Yes, I've noticed that. You'd also have to create a custom icon for the file, because the OS would otherwise associate the "vbs" (or whatever) icon with the file--a dead giveaway that it wasn't a text file. Of course, custom icons don't tend to survive being emailed, so it's *reasonably* safe.

    15. Re:Linux thoughts by GypC · · Score: 2

      I wasn't sure if you knew anything about Unix or not. I didn't mean to condescend.

      No offense, but users should have read-only access to a FAT filesystem as it implements no security.

    16. Re:Linux thoughts by Khelder · · Score: 1
      I think iabervon's comment had nothing to do with the difference between file type and creator. Either or both of those can be used to determine what application to use on a file. I think his main point was right on: How useful file metadata is to you depends on how you use your files.

      As iabervon said, CLI (e.g., Unix) users normally specify the program to use on a file explicitly. Metadata is not required. If you always use the same application on the same file types, this is not efficient for the user, but if you use many different applications on the same file, then it isn't a big deal (and may even be more efficient than the GUI strategy).

      OTOH, GUI (e.g., Mac) users most often double-click on a file to operate on it, and let the system choose the right application to use. In this case, the system needs metadata to figure out the right application (e.g., type, creator, or some combination of the two).

      [from iabervon] Incidentally, has anyone else noticed that the MacOS scheme is equivalent to having 4 character extensions which aren't displayed, with the corresponding problem of having malicious executables named README.txt (or even README)?

      I'm not sure about this since I rarely use Macs, but doesn't the Mac show different icons depending on the file type? A Mac user would not mistake a non-text file for a text file since it wouldn't have the "text file" icon.

    17. Re:Linux thoughts by iabervon · · Score: 2

      Actually, it's no big deal linking against a weirdly-named library; you just have to specify the exact filename instead of -lfoo. The shortcut isn't about the extension, either: you'll have just as much of a problem with files that don't start with "lib". The linker doesn't actually care at all what the filename is; it's just that it provides a quick way of writing some filenames.
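
      A simplified sketch of what the -lfoo shorthand expands to, following the conventional Unix linker behaviour of trying libfoo.so and then libfoo.a in each search directory. The function names are made up, and real linkers have more cases (versioned names, -static, and so on).

```python
import os


def candidate_names(lib):
    """Filenames conventionally tried for -l<lib>: shared first, then static."""
    return ["lib%s.so" % lib, "lib%s.a" % lib]


def resolve(lib, search_dirs):
    """Return the first existing candidate path, or None if nothing matches."""
    for directory in search_dirs:
        for name in candidate_names(lib):
            path = os.path.join(directory, name)
            if os.path.exists(path):
                return path
    return None
```

      A library called "weird.lib" is never found this way, but passing its full path to the linker works fine, which is the point made above.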

      The compiler, is, in fact, an exception. Ideally, perhaps, it would try to identify the file type by looking at the file; in any case, you can tell it what you mean if it isn't guessing correctly from the name.

    18. Re:Linux thoughts by iabervon · · Score: 2

      The difference is that what the machine does is what your mother tells it to do, not what the file tells it to do. If she downloads a file that she thinks is text and tries to read it, and it's a binary, she gets a screenful of garbage. If she's intentionally running programs, that's one thing; if she's running a program when she means to be just looking at something, that's a problem.

      Windows has all of the problems that MacOS does, except worse, because it's badly implemented.

    19. Re:Linux thoughts by iabervon · · Score: 3, Interesting

      My point was that, under Linux, "creator" isn't very useful. Most of my files are created by "emacs" (or "cp" or "sed" or something, when I copy a template), but what I normally do with them is compile them. The filename extension only matters a bit (saves having to tell the compiler what language it is explicitly); having a type code would have the same effect, but having the creator wouldn't help at all.

      MacOS and Windows are designed such that you tend to use the same program to deal with a given file, no matter what you're doing with it. If you have a JPEG you made with the GIMP, you'll view it in the GIMP. If you're going to compile a source file, you edit it in the compiler's IDE, and you view it in the IDE. *nix is designed such that you use different programs for different operations (edit, view, compile, render, etc), and use the same program for a given operation for a number of different file types (C, HTML, English text, etc).

      Of course, Windows gets the worst of both worlds-- you have monolithic applications which do everything to a given program, but you don't have the creator metadata, so it picks a program badly.

      I think it would be nice to have file types under Linux; currently, there are a number of partial solutions: emacs has an "Edit this file in -mode" directive, most binary types have magic numbers (e.g., GIF89a, , ^?ELF, JFIF, etc), and some programs look at some extensions. Of course, there would have to be a number of different types associated with a given file (Java source, UTF-8, plain text, etc), and it would have to be simple to specify the type of a new file when you create it, which is currently done partially by naming it in accordance with a convention and partially by putting in data which looks like a certain type, both of which you'd want to do anyway.

    20. Re:Linux thoughts by Brian+Knotts · · Score: 2
      One solution is what OS/2 did. It had EAs in the filesystem, and there were various actions one could do on a particular object.

      On a .c file, for instance, you would have a "C Source Code" filetype defined in the EA, and a right click in the GUI would display a number of possible actions, each of which may well invoke a different program.

    21. Re:Linux thoughts by Anonymous Coward · · Score: 0

      > and which program is right is based on file type

      no! nonono! that's *not* how it works! the program is determined based on *creator* type!

    22. Re:Linux thoughts by castlan · · Score: 1

      I think you meant to say that with a command line interface, like BASH for instance, a creator type isn't useful. That is because there is not an equivalent to "double clicking" a document, so there is no infrastructure in place for automatically calling an external default (not necessarily) application to execute the document. That does not mean that a "creator" field can not be useful under "Linux", just not under current command line implementations.

      Under a GUI, "creator" like information can be very useful if properly utilized. It shouldn't be called a creator type though, that implies that the creating application should be the default application launched. This is a shortcoming of the MacOS implementation, especially considering Apple's emphasis on content creation. The tools for HTML generation and photo editing aren't well suited just for browsing the same media. Such problems have been solved on other systems.

      While it is an understandable foible to add misguided reverence to a command line artifact along with the command line itself, it is highly shameful considering Apple's record of innovation in interface semantics. Should the cyclops put out one eye to better fit in with the blind? Why destroy a superior, if still insufficient file typing mechanism, for a clearly inferior solution? Because it's good enough for Windows?

      Three letters cannot hold enough information to represent all of the distinct file types that are necessary in modern computing systems. Even Apple's four character types permuted with four character creators weren't enough, the limited space was running out. It is appalling that they exacerbated the situation by severing the superior system they had in place, instead of adopting or creating a superior system. BeOS adopted MIME types for use in their system, after determining that Apple style Type + Creator codes weren't sufficient. Why did Apple go backwards? Now the seemingly defunct BeOS has a better GUI than most current systems, with the "creator type" situation nicely solved. Instead of having a distinct creator type, the MIME derived file types all have default applications associated with them. Thus, double clicking will usually open the desired program, and right clicking will show a list of all available applications that know how to open that type of file.

      Thankfully, this one point is probably moot, as at least GNOME currently maintains (BeOS) Style MIME types. Such functionality is indeed most useful in a GUI environment, whether on top of Linux or any other Kernel. What would be really revealing is some benchmarks determining efficiency of MIME typing versus automatic type determination with magic numbers and file headers. I suspect that even if performance measures weren't clear, there would still be an advantage to having an up-to-date database of file types contained on the system.
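
      To make the two approaches being compared concrete, here is a minimal Python sketch. It assumes the standard library's `mimetypes` module as a stand-in for a name-based type database, and a tiny hand-written signature table as a stand-in for a real magic-number database; neither is how GNOME or BeOS actually does it, and `MAGIC_SIGNATURES`, `type_from_name` and `type_from_content` are names made up for this illustration.

```python
import mimetypes

# Illustrative only: a real magic-number database holds far more entries.
MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"%PDF-", "application/pdf"),
]


def type_from_name(filename):
    """Guess a MIME type from the file name alone; the file is never read."""
    mime, _encoding = mimetypes.guess_type(filename)
    return mime


def type_from_content(data):
    """Guess a MIME type by comparing the leading bytes to known signatures."""
    for signature, mime in MAGIC_SIGNATURES:
        if data.startswith(signature):
            return mime
    return None


# A GIF that somebody renamed to .txt: the two methods disagree.
header = b"GIF89a" + b"\x00" * 10
print(type_from_name("picture.txt"))
print(type_from_content(header))
```

      The name lookup never touches the file, while the content check has to read at least the first few bytes of every file it classifies, which is the cost such a benchmark would be measuring.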

      Now if Gnome would actually make better use of them, I could finally start complaining about the other GUI enhancements Linux supported GUIs should steal from Be (If Apple doesn't smarten up first and steal them back :)

  27. Ex: Amiga .info files. by Fly · · Score: 1
    The AmigaOS did this, too. Any file could have an associated .info file that could specify file metadata such as an application to open the file, the file's icon, etc. The Workbench GUI would then display all files (with OS2.0 and above) except for the .info files, which could still easily be accessed from the command line, or modified using the GUI to view and edit the .info file.

    It worked pretty well, IMO. Something similar could be done for Linux and would not need to be limited to providing metadata for the GUI. Applications would have to individually start supporting this and avoid trampling each other's metadata.
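
    A rough Python sketch of what a sidecar scheme like this might look like on a modern system. Everything here is an assumption made for illustration: the JSON layout, the per-application namespaces (one possible way to keep applications from trampling each other's metadata) and the function names. The real Amiga .info format is different; only the suffix is borrowed.

```python
import json
import os
import tempfile

SIDECAR_SUFFIX = ".info"  # hypothetical: only the naming is borrowed from the Amiga


def sidecar_path(path):
    """Return the path of the metadata file that accompanies `path`."""
    return path + SIDECAR_SUFFIX


def read_metadata(path):
    """Load the sidecar's contents, or an empty dict if there is no sidecar."""
    try:
        with open(sidecar_path(path), "r", encoding="utf-8") as fh:
            return json.load(fh)
    except FileNotFoundError:
        return {}


def update_metadata(path, namespace, values):
    """Merge `values` under one application's namespace, leaving others alone."""
    meta = read_metadata(path)
    meta.setdefault(namespace, {}).update(values)
    with open(sidecar_path(path), "w", encoding="utf-8") as fh:
        json.dump(meta, fh, indent=2)


def visible_files(directory):
    """List a directory the way a GUI might: sidecar files are hidden."""
    return sorted(n for n in os.listdir(directory)
                  if not n.endswith(SIDECAR_SUFFIX))


with tempfile.TemporaryDirectory() as demo_dir:
    doc = os.path.join(demo_dir, "report")
    with open(doc, "w", encoding="utf-8") as fh:
        fh.write("hello\n")
    update_metadata(doc, "desktop", {"open_with": "editor"})
    update_metadata(doc, "indexer", {"keywords": ["demo"]})
    demo_listing = visible_files(demo_dir)
    demo_meta = read_metadata(doc)

print(demo_listing)
print(demo_meta)
```

    The sidecar stays an ordinary file, so it remains reachable from the command line, which matches the behaviour described above.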

    end of line

    --
    end of line
  28. Okay, so metadata is good for applications... by BadTiggy · · Score: 1

    But what about users? I mean most applications are smart enough nowadays to ignore file types and simply test file formats. If I feed a text file with no extension into notepad, it doesn't reject it cause there's no .txt extension on it, and even image viewers can determine the type of image without an extension (or the incorrect extension to boot!)

    I think file name extensions are more important to the user than to the applications really. If I don't need a file extension, then I don't put one. If I do, then I do. Using windows with the file types hidden drives me nutty, since I'm not heavy into GUI, the icons are meaningless to me and not shown in all applications.

    Example: Programming. In a project directory there can be multiple object, source, and header files, often with similar names. The extensions serve to show ME which is which, not to tell (insert favorite text editor here) that it's trying to open an object file, C source code, or a C header file. I can make extensions that tell me which files are what just at a glance. I can feed GCC any filename I want as long as I tell it what it is.

    Anyhow, just my two cents.

    --
    "If I blow your mind, you have to promise not to think in my mouth."
  29. Re:BeOS has already SOLVED the FileType/Metadata p by XBL · · Score: 1

    Yep :-)

    BTW, what's going on with the BeNews server?

  30. You have missed the point by Anonymous Coward · · Score: 0

    The author covers your foo.txt --> foo.html example when he describes how type information can be narrowed. He uses the example of the broader type of gif going to the more specific gif89a; your example is going from a text file to a specific kind of text file. The type of the file never changed: it was an html file when you created it, it was an html file when you renamed it.

    You also seem to be confused about the Mac's creator codes, which simply don't exist on other operating systems. In addition to tracking the type of a file, the Mac also tracks (as meta-data) the application that created it. Does this mean a user cannot create a gif file with one application and edit it with another? Of course not. What it means is that when you double-click on a file, the application that created it will open - which is usually what the user wants. If you want to open it with another application, you open that application and then open the file.
    You are right that the older Mac OS made it more difficult than it needed to be to change type and creator. But there were literally dozens of applications that filled that gap.
    You are right that type is part of the name - but you seem to have missed one of the author's key points: it doesn't have to be, and probably should not.

    1. Re:You have missed the point by Anonymous Coward · · Score: 0

      wrongo. How do you know a text file is a text file?

      Windows: It says ".txt"

      Mac: It has an icon that people who've been using macs for a long time know means "text"

      The type is far less evident at first glance on a Mac. It's a huge pain. What do you do with a foreign file called "DontRunMe" and an icon that looks like a bird? Is it an executable? Is it a virus? What application will launch when I click on it? Is it an application itself? Is it a shell script? I HAVE NO IDEA (because my operating system thinks it's smarter than me).

      Instant deterministic type recognition is a FEATURE. Windows has it. Macintosh doesn't. This makes for a less secure, harder to use operating system for people who understand how to use a computer.

      Idiots, on the other hand, love Macs. As well they should.

    2. Re:You have missed the point by adjusting · · Score: 1

      >What it means is that when you double-click on a file, the application that created it will open - which is usually what the user wants.

      This is almost never what I want. When I create an html file with golive, I want to open it with my web browser. When I create a gif file with Graphic Converter, again I want to open it with my web browser.
      The worst part about this situation is not the incorrect assumption, it's that there's no built in way to change which application opens a file.
      Thank god for MacOS X.

    3. Re:You have missed the point by itachi · · Score: 1

      Well, you start terminal.app, and you use the file command to get info on the file. Man 1 file for more details. Or, through the gui, file info will tell you the file creator, creation date, last modification date, file type, creation app, etc. Command-F, I think. If you know how to use a computer, you can figure these things out. Of course, if a Mac is too challenging for you, I have a used etch-a-sketch... IHBT, IHL, I'll HAND.

      itachi

    4. Re:You have missed the point by Anonymous Coward · · Score: 0

      Or, through the gui, file info will tell you the file creator, creation date, last modification date, file type, creation app, etc. Command-F, I think

      And this is instant deterministic recognition how? If it doesn't work on the desktop, or in folders, instantly, it's not instant. Jesus. Allow me to repeat myself:

      Instant deterministic type recognition is a FEATURE. Windows has it. Macintosh doesn't.

      God, what is it with these mac retards?

    5. Re:You have missed the point by Seth+Milliken · · Score: 1

      Um...yeah. And what is it with these people who criticize something with which they have obviously had no significant experience?

      The Finder can be configured to display the "Kind" column in list view, which more or less provides a textual description of the associated application and file type.

      (FWIW, the information panel is brought up with Command-I; Command-F launches Sherlock, the find file application.)

    6. Re:You have missed the point by Com2Kid · · Score: 1

      >The Finder can be configured to display the "Kind" column in list view, which more or less provides a textual description of the associated application and file type.

      Or on a PC I can just, err, well, look at the bleeping file. Simple. I see the file, I see the information, no questions, no changes, it is there. Badda boom badda bing, tada, poof, tis magic or something, whatever. The information is available and given to me /up front/.

      I paid for my computer, my computer didn't pay for me. I am to use it, it is not to use me.

    7. Re:You have missed the point by Gorimek · · Score: 2

      wrongo. How do you know a text file is a text file?

      Windows: It says ".txt"

      Mac: It has an icon that people who've been using macs for a long time know means "text"


      Or to use the same standard for both systems:

      Windows: Its name ends in ".txt", which people who've been using PCs for a long time know means "text". Unless they're watching the file name with the extension hidden, which is often (but not always) the default setting.

      And of course .txt is pretty unique in that many people can guess that it probably means "text". Most extensions are not as clear, you just have to know them all by heart to be able to use them.

    8. Re:You have missed the point by itachi · · Score: 1

      Really? On my win2k workstation, if I look at the file I get no more information than if I just look at a file on a Mac. Now I can do the same thing with win2k that was suggested with MacOS, namely, change the default view preferences, but I don't see how this is a fault of the Mac and a benefit of Windows. As for *nixes, other than metadata possibly contained in the name of the file and execute permissions, there is no immediately visible source of file type information at all. How does a white rectangle and a yellow scroll in it tell me anything about a javascript file? If I already know the association between the file type and the icon, I'm fine, but I'm not seeing any other immediately visible source of metadata there. How will that tell me permissions, or creation date? If I want that info, I'll have to do the same thing that I'd have to do with the Mac - ask for the info. Stop trolling, sparky.

      itachi

    9. Re:You have missed the point by Anonymous Coward · · Score: 0

      The Finder can be configured to display the "Kind" column in list view, which more or less provides a textual description of the associated application and file type.

      Command-I, Command-F, Sherlock, and the Finder are not instant deterministic type recognition. You must be able to instantly determine the type in all contexts to have this feature. Please read the post before responding. Otherwise you just look like an idiot. Thx.

      Instant deterministic type recognition is a FEATURE. Windows has it. Macintosh doesn't.

    10. Re:You have missed the point by Anonymous Coward · · Score: 0

      >Windows: It says ".txt"
      >Mac: It has an icon that people who've
      >been using macs for a long time know
      >means "text"

      nothing keeps you from naming your file "foo.txt" on a mac. personally, I see the icon before I read the name, therefore I think a different icon is more useful than a different name, but that's just my preference, and the mac doesn't force me to work that way. it just allows me to if I want to.

      >The type is far less evident at first
      >glance on a Mac. It's a huge pain. What
      >do you do with a foreign file called
      >"DontRunMe" and an icon that looks
      >like a bird? Is it an executable?
      >Is it a virus?

      uhm... things like this are even easier to do on Windows, but if you really get a file like that, click on it and press command-i to get to the file info window. it will tell you what kind of file it is and what program will open it if you double-click it.

    11. Re:You have missed the point by Com2Kid · · Score: 1

      On my win2k workstation, if I look at the file I get no more information than if I just look at a file on a Mac.

      Icons are standardized for file types, nothing new there though.

      The file name itself, 88888888.333. Well, at least for those of us who are WAY too used to naming things according to that scheme (hmm. . . . heh).

      The file extension shalt reveal all.

  31. If a MIME gets built into a filesystem... by Soong · · Score: 1

    Suppose ext4fs (some time in the future) has built in MIME types associated with every file, and an optional XML metadata piece too.

    Apache gets mod'd to use the MIME types built into the files when possible instead of using other magik.

    GNOME/KDE/whatever access a database associating MIME/XML with applications.

    So far, we have a great, at-least-as-good-as-Mac user experience, within the machine. Files moved by HTTP will retain the basic type. The XML could be available to those who want it by other extensions to protocol.

    But, lots of things need updating. How do you "rm *.html" when things are "text/html" behind the scenes? Shells need updating. New C library calls. Much of unix may need some rethought. Perl will need extensions (been done, see MacPerl).
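
    As a sketch of what "select by type instead of by name" could look like, here is a small Python example. It assumes some lookup function that returns the type stored with each file; the hypothetical ext4fs above doesn't exist, so a plain dictionary stands in for the filesystem, and `select_by_type` is a made-up name.

```python
import fnmatch


def select_by_type(names, pattern, type_of):
    """Return the names whose stored type matches a glob-style MIME pattern.

    `type_of` is any callable mapping a name to its stored MIME type (or None).
    In the scenario above it would ask the filesystem; here it is a dict lookup.
    """
    return [n for n in names
            if fnmatch.fnmatchcase(type_of(n) or "", pattern)]


# Stand-in for per-file types kept by the filesystem (assumption, not a real API).
stored_types = {
    "index": "text/html",
    "notes": "text/plain",
    "logo": "image/png",
    "about": "text/html",
}

names = sorted(stored_types)
html_files = select_by_type(names, "text/html", stored_types.get)
text_files = select_by_type(names, "text/*", stored_types.get)
print(html_files)
print(text_files)
```

    A shell built-in along these lines would let a type pattern play the role that "*.html" plays today, with none of the names carrying an extension.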

    Like the article said, FTP and Mail need updating to talk to foreigners and translate metadata to their system.

    Apple may still have a chance to not botch it too. I'd like to see that.

    --
    Start Running Better Polls
  32. I HATE the MacOS and its stupid metadata! HATE it! by Chasing+Amy · · Score: 3, Insightful

    Actually, I lied. I don't hate MacOS; I just wanted to get your attention by yelling about it. Now that you're here, though, I have to say that I LOVE the MacOS, and have ever since I first used it, before it was even called MacOS. I started with System 7, which was so attractive and easy to use that it's still my bar for measuring other interfaces.

    But if there is one thing I intensely dislike about MacOS, it's the metadata. I know I'm practically alone in the Mac camp, but I hate metadata. I have always thought it was just a space-hogging pain in my ass.

    Now, the space issue is no longer a big concern since we have such big, cheap drives that a little filesystem metadata isn't such a burden on capacity. But back in the days of floppies I was pissed that I could fit so few files on a floppy when my friend with DOS could fit noticeably more. I was especially annoyed that even when I formatted a disk as a PC floppy, the Mac would still waste my space by creating and hiding from me files and folders on the disk to constitute the resource forks. I wanted every kilobyte, which counts when you're cramming a lot of small files onto a lot of small disks.

    But of course this is no longer the big issue it used to be. But if I were storing large numbers of files and running out of space on a Mac, I'd still silently curse all that metadata wasting my capacity.

    The part that still bothers me, now that capacity is no longer a substantial issue, is that in Windows or *nix I can instantly change file types from the interface, but not with Mac. It comes up a lot--many times a day. Click a filename, change three letters, and a text file is recognized as a script or batch file to be executed rather than opened. A click and three letters, and a file I just downloaded from USENET goes from text to UUencoded so that when I double-click it will be decoded for me. A click and three letters is all it takes to change a file's type and its application association from the GUI, without having to resort to some clunky special editor. And it's even better if I need to change the type/association of a great number of files--just open a CLI and type a quick line, and it's all done. What a pain it would be to have to use a metadata editor instead of just manipulating three letters in filenames. Simple file extensions put more power over the file within easy, simple, even automatable reach.
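
    For the "great number of files" case, the quick line in question amounts to something like the following Python sketch. The function name and the .txt to .uue example are made up for illustration; a shell loop would do the same job.

```python
from pathlib import Path
import tempfile


def change_extension(directory, old, new):
    """Rename every file in `directory` ending in `old` so it ends in `new`.

    Returns the new names. Files whose target name already exists are skipped
    rather than overwritten.
    """
    renamed = []
    # sorted() materialises the listing before any renaming starts.
    for path in sorted(Path(directory).glob("*" + old)):
        target = path.with_suffix(new)
        if target.exists():
            continue
        path.rename(target)
        renamed.append(target.name)
    return renamed


with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("a.txt", "b.txt", "c.log"):
        (Path(demo_dir) / name).write_text("demo\n")
    renamed_names = change_extension(demo_dir, ".txt", ".uue")
    remaining = sorted(p.name for p in Path(demo_dir).iterdir())

print(renamed_names)
print(remaining)
```

    Only the names change; the bytes inside each file are untouched, which is exactly why the trick is both cheap and easy to automate.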

    The advantage of metadata is something many Mac users, and theorists like this article's author, seem to believe in, but I cannot see it. For instance, it's thought a great advantage that you can set a file to open with any application, despite the filetype. I hate downloading things on a Mac because of this. Some idjit will have a file set to open in an application I don't have, and the computer may be too stupid to know that I always open that file type in Application X. A dialog pops up on any reasonably modern MacOS to help, but it's still a big pain in the ass compared to having a PC automatically know what I open that file type with. Even more annoying is when I really do have the application the file is set to open with installed, but I always want that file type to open in a different app. This most often happens with graphics files--I do not under any circumstances want to have Photoshop or Graphic Converter open a graphics file, just because that's what it was created in. I have a simple image viewer for viewing images. If I want to edit them, *then* I open them in Photoshop. Same for Premiere and others--I do not want a big, slow editor to open my files just because that's what they were created with; we have smaller-footprint and more versatile file viewers for that.

    The other part of it is that the "simplistic" (sometimes the most simple designs are the most elegant, while the more complex are just gaudy) file typing systems also solve the problem of opening certain files of a given type in one application but others of the same type in another application. Metadata proponents always point out how "great" it is to have one, for example, JPEG open in JPEGview or whatever, while another JPEG opens in Photoshop; one .wav opens in a player, while another opens in an editor or burner. Well, I think the solution offered by Windows and by some *nix environments is better, easier, simpler, more elegant. A simple context menu, brought up by right or center-clicking, provides any options you could want. That way to open something in my viewer application, I just double-click--I know on my Windoze box that all image files (except .psd) will automatically open in my viewer, ACDSee (which recently became available for Mac, too)--no surprises, no metadata editors needed. If I want to edit it, I just right-click and choose the command "Edit" from the menu, which is set to open images with Photoshop. Same with .wav and other such--double-clicking opens in WinAMP, right-clicking and choosing "edit" opens in SoundForge. You can create any action, and choose any app to be associated with that action, for each file type--and then a list of all the possible actions for that file type will be displayed when you right-click a given file. But it will open in whatever you've set to be your standard viewer, by default, if double-clicked. Much better than relying on hidden metadata. But even better and simpler than having to set up the actions and associations in the Folder Options dialog, is just using the Send To sub-menu that is brought up on right-click--just drop shortcuts to the apps you usually use into the Windows\SendTo folder, and those apps will appear on the Send To submenu when you right-click. That way I can easily open any file with any application, by using only one right-click and one left-click. In terms of launching files, it's like having the flexibility of a CLI, but within the ease-of-use of a GUI. That's one feature the Windows GUI actually got right, and got right very early on. MacOS can keep its metadata, but this is easier, simpler, better. I love the Send To submenu, though it's usually under-utilized by most people.
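
    The per-type action scheme described here boils down to a two-level lookup: extension first, then action. A minimal Python sketch, assuming a made-up in-memory table (Windows keeps its real associations elsewhere, and `ASSOCIATIONS`, `actions_for` and `resolve` are names invented for the example):

```python
import os

# Hypothetical association table: extension -> action -> application.
# The application names are the ones mentioned in the post above.
ASSOCIATIONS = {
    ".jpg": {"open": "ACDSee", "edit": "Photoshop"},
    ".wav": {"open": "WinAMP", "edit": "SoundForge"},
}


def actions_for(filename):
    """Return every action registered for this file's extension."""
    ext = os.path.splitext(filename)[1].lower()
    return ASSOCIATIONS.get(ext, {})


def resolve(filename, action="open"):
    """Return the application for `action`, or None if nothing is registered.

    The default action "open" plays the role of a double-click.
    """
    return actions_for(filename).get(action)


print(resolve("Holiday.JPG"))           # double-click
print(resolve("Holiday.JPG", "edit"))   # context menu, "Edit"
print(sorted(actions_for("song.wav")))  # what the context menu would list
```

    Note that the decision depends only on the file's name and the local table, never on anything stored with the file itself.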

    I hate to say it, but the metadata folks are IMHO going the wrong way. I want more power and flexibility within my clicks, not less. I hate having to edit metadata when a simple three-letter change is all that would be needed in *nix and 'doze. And as I said, the advantages of metadata in terms of application/file association are entirely negated by the right-click menu and its Send To submenu in Windows, and similar functionality in some *nix GUIs. Metadata may have good uses, but none I can think of that can't be done more simply and elegantly. I also dislike the idea of my filesystem hiding things from me, which unfortunately is exactly what MacOS does and what the newer NTFS in Win2k and up can do (I believe Ars had an article when Win2k came out about the new NTFS and some of the still-largely-unused metadata fields). Ext2 or FAT32 all the way, baby--and before you poo-poo FAT32, it may have almost no modern features, but it is straightforward, simple, and actually very fast in performance (thanks to the fact that it implements no real modern features); I recall it beating out NTFS in terms of raw speed in an old Ars article. Poor crash recovery is its main weakness.

    I like to keep things as easy to manipulate as possible. And contrary to what many make the mistake of thinking, file extensions are not just easy for CLIs--as I said, it makes sense in a GUI too, since it can be directly manipulated from within the GUI's file browser, without having to open the file in a metadata editor. It also makes the type of file crystal-clear--especially important if you don't want to accidentally run an executable that has an icon to make it look like a file. Unless OS X has some way which I haven't noticed to visually set executables apart from other file types, even when they're on the desktop or somewhere else that doesn't show details, I can't wait for someone to create lots of OS X viruses that have common file icons. That's already a case in the Windows world, where you'll find files called Report.doc.exe that have Word icons, but if you notice the trailing extension you won't mistakenly execute them (though the "show extensions for all file types" option isn't the Windows default anymore, alas). How can you tell by a glance in OS X, or any other place where metadata rules instead of file extensions?
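
    The Report.doc.exe trick can also be caught mechanically. A small Python sketch, assuming two short, incomplete extension lists chosen purely for illustration (the set names and `looks_disguised` are made up):

```python
# Illustrative, incomplete lists; a real check would need far more entries.
EXECUTABLE_EXTENSIONS = {".exe", ".com", ".bat", ".scr", ".pif"}
DOCUMENT_EXTENSIONS = {".doc", ".txt", ".jpg", ".gif", ".pdf"}


def looks_disguised(filename):
    """True if the name ends in a document extension followed by an executable one."""
    parts = filename.lower().rsplit(".", 2)
    if len(parts) < 3:
        return False  # zero or one extension: nothing is hidden behind another
    inner, outer = "." + parts[1], "." + parts[2]
    return outer in EXECUTABLE_EXTENSIONS and inner in DOCUMENT_EXTENSIONS


for name in ("Report.doc.exe", "Report.doc", "setup.exe"):
    print(name, looks_disguised(name))
```

    With extensions hidden, only the name up to the last dot is displayed, which is why the check has to look at the full name.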

    Oh well. Windows may not have a lot right--but it does have its use of simple file extensions and simple context menus right. I always hated editing resource forks. It's just another *unnecessary* layer getting between a man and his hardware. Tell me one very useful thing that can be done with filesystem metadata, that can't be done easier and put more in direct control of the user. And before you say "labeling," like MacOS prior to X used to have--that's what folders/directories are for. :-)

    --

    Chasing Amy
    (We all chase Amy...)
    "The more corrupt the state, the more numerous the laws"-Tacitus
  33. dump file types! application binding is better by abde · · Score: 2

    if I understood the article, file types are bad because they get in the way of allowing the user to determine how to open and view files. The only real reason to want file types is closely related to application binding, IMHO - some users want *all* html files to open in Frontpage, others want to pick and choose on a per-file basis, most want something in between.

    But then why even *have* file types? You can survive quite nicely without them if you do have application binding metadata. Whenever you use an app to create a file, that file should be bound to that app. If you want to subsequently open that file in a different app, then you should let the app try. It's up to the APP, not the filesystem, if it can open it or not. Why shouldn't you be able to open a JPG in notepad? if notepad has a hex viewing capability, it should open just fine.

    a well-designed app should let the user attempt to open any file. It should try and interpret the data correctly, and it should allow the user to bind the file to the app if they so choose.

    IMHO the whole notion of file types is a mistake - the Mac approach seems to be, incorporate type as metadata, the windows approach seems to be use an extension. But neither is really necessary.

    as a final note - dumping file types avoids the "identical icon" problem that the author demonstrated in the screenshot. Simply use the icon for the file that corresponds to the *binding*, not the file type.

    --
    Don't blame me - I voted for Howard Dean. http://dean2004.blogspot.com
    1. Re:dump file types! application binding is better by AndrewHowe · · Score: 2

      a well-designed app should let the user attempt to open any file. It should try and interpret the data correctly

      Isn't this the whole point though? Data is (are?) just data. It doesn't mean anything unless you know how to interpret it. It's like DNA. DNA is just a bunch of data. It doesn't contain anything saying "I'm DNA" or "Read me like this". That information is external to the data itself (themselves? :)
      Even on top of that, once you get to the file data, often there are multiple subtypes within it. For example, RIFF files are composed of chunks, each containing different types of data. As long as we consider files to be monolithic, opaque blobs, we're restricting ourselves.
      XML is... A discussion for another day...
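
      To illustrate the RIFF point, here is a minimal Python sketch that walks the top-level chunks of a RIFF container. It assumes the usual layout (a "RIFF" tag, a little-endian 32-bit size, a 4-byte form type, then chunks made of a 4-byte id, a 32-bit size and a payload padded to an even length). The helper names and the sample bytes are made up, and nested LIST chunks are not descended into.

```python
import struct


def iter_riff_chunks(data):
    """Yield (chunk_id, payload) for each top-level chunk in a RIFF file."""
    if len(data) < 12 or data[:4] != b"RIFF":
        raise ValueError("not a RIFF file")
    (riff_size,) = struct.unpack_from("<I", data, 4)
    end = min(len(data), 8 + riff_size)
    offset = 12  # skip "RIFF", the size field and the 4-byte form type
    while offset + 8 <= end:
        chunk_id, size = struct.unpack_from("<4sI", data, offset)
        start = offset + 8
        yield chunk_id.decode("ascii", "replace"), data[start:start + size]
        offset = start + size + (size & 1)  # chunks are padded to even length


def make_chunk(chunk_id, payload):
    """Build one chunk for the demo, adding the pad byte when needed."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad


# A tiny hand-built container; the payloads are placeholders, not real audio.
body = b"WAVE" + make_chunk(b"fmt ", b"\x01\x00") + make_chunk(b"data", b"abc")
sample = b"RIFF" + struct.pack("<I", len(body)) + body
chunks = list(iter_riff_chunks(sample))
print(sample[8:12], chunks)
```

      Each chunk carries its own type tag, so a reader can skip the ones it doesn't understand, and a single file-level type says little about what is inside.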

    2. Re:dump file types! application binding is better by Anonymous Coward · · Score: 0

      Someone has said this already but...what if you create a JPEG in photoshop and then give it to your friend who doesn't have photoshop?! The OS (re MacOS) throws a fit, that's what happens. Then you have to go and change the meta-data anyhow. With file extensions, none of that matters anymore. One machine will open a JPEG in Photoshop, another machine will open it in whatever program was chosen as the default for that file-type. Win2k handles it nicely in that you can have a default program to handle file-types or right-click and choose a specific program. Simple.

    3. Re:dump file types! application binding is better by Gilmoure · · Score: 1
      what if you create a JPEG in photoshop and then give it to your friend who doesn't have photoshop?! The OS (re MacOS) throws a fit, that's what happens.



      On most modern Macs, if they don't have Photoshop installed (as indicated by the creator code), the Mac will see from the type code that the file is a jpeg and will bring up a list of applications that support jpeg, as well as offering to open the file in applications that don't directly support jpeg, by way of translation with Quicktime.
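
      The fallback described above (try the creator first, then fall back on the type) might be sketched like this in Python. The registry layout, the four-character codes and the function name are all hypothetical; they are not Apple's actual codes or APIs, and the QuickTime translation step is left out.

```python
def candidates(file_type, creator, installed):
    """Return (preferred_app, alternatives) for a file's type and creator codes.

    If the creating application is installed and handles the type, it is the
    preferred app. Otherwise every installed app that handles the type is
    offered as an alternative for the user to pick from.
    """
    app = installed.get(creator)
    if app is not None and file_type in app["opens"]:
        return app["name"], []
    alternatives = sorted(a["name"] for a in installed.values()
                          if file_type in a["opens"])
    return None, alternatives


# Hypothetical registry keyed by made-up creator codes.
friend_machine = {
    "VIEW": {"name": "Viewer", "opens": {"JPEG", "GIFf"}},
    "BRWS": {"name": "Browser", "opens": {"JPEG", "TEXT"}},
}
author_machine = dict(friend_machine,
                      PSHP={"name": "Photoshop", "opens": {"JPEG", "TIFF"}})

print(candidates("JPEG", "PSHP", author_machine))
print(candidates("JPEG", "PSHP", friend_machine))
```

      The creator code only expresses a preference; the type code is what keeps the file usable on a machine that lacks the creating application.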



      The problem with file types being appended to file names occurs when a user takes a Word file (listbday.doc) and renames it listb.day (or something equally stupid). If an app is solely depending on name extensions for file types, it's screwed. The user has just changed the file type without actually changing the data.

      --
      I drank what? -- Socrates
    4. Re:dump file types! application binding is better by abde · · Score: 2


      > It doesn't contain anything
      > saying "I'm DNA" or "Read me like this"

      sure it does - the "control codes" for DNA are embedded in between the genes. There are genes that contain "data" and there are "start", "stop" , and other more complex signaling all built in. DNA is a *bad* model for filesystems because the data and metadata are all in one long stream, but it works because it's a massively parallel system, and carefully and precisely regulated by enzymes (analogous to environment variables)

      --
      Don't blame me - I voted for Howard Dean. http://dean2004.blogspot.com
    5. Re:dump file types! application binding is better by AndrewHowe · · Score: 2

      No, the signals you talk about are simply strings of bits, fundamentally. Yes, a certain combination of three consecutive bases means "stop". But that idea is not itself embedded in the DNA. It's held in the DNA reader. That was my whole point. DNA does not contain metadata, it is just data. A "stop" code is just as much data as a "make this amino acid" code. Metadata is by definition something about the data. You could put it in the same "stream" but you don't have to, and it often doesn't make sense to do so.

    6. Re:dump file types! application binding is better by Anonymous Coward · · Score: 0

      >Someone has said this already but...what
      >if you create a JPEG in photoshop and then
      >give it to your friend who doesnt have
      >photoshop?! The OS (re MacOS) throws
      >a fit thats what happens.

      that's bullshit. MacOS shows you a list of programs that are capable of opening a file with the given file type. you choose one, it opens the file with the app. that's what happens.

      maybe you want to make sure you know what you're talking about next time you decide to badmouth something.

    7. Re:dump file types! application binding is better by abde · · Score: 2


      not true - there are also long strings which do not code for protein but are "attachment" points which are where the transcription enzymes know where to latch on. DNA is processed by external readers and those readers look for codes embedded in the DNA bitstream to decide when to attach and where to attach. Once they have attached, and started processing, THEN start-stop becomes relevant. There may be several start-stop regions in one long patch. But HOW to process the data is embedded in the data.

      i recommend Stryer for a good biochem text...

      --
      Don't blame me - I voted for Howard Dean. http://dean2004.blogspot.com
  34. Is Siracusa a Mac bigot? by megaduck · · Score: 2

    From the article:

    Any part of the Mac OS user experience that exactly duplicates the experience on another platform ceases to be a compelling reason to buy a Mac.

    I totally disagree. I had absolutely no interest in Macs until OS X, and the reason I switched was because it acts just like a *nix. I can pull up bash, run emacs, grep, sed, awk, etc. Duplicating the unix experience was a very compelling reason for me to buy a Mac. Naturally, little things like Quicktime, games, and DVD support sweetened the deal. :)

    As far as metadata is concerned, I think that Mr. Siracusa is right. The current unix way of handling metadata sucks. Unfortunately, the future does not lie with the old "Mac Way", which is arguably a good deal more elegant. Steve Jobs knows this, which is why his new OS is based on unix, despite its occasional warts (like file extensions). Apple has done what it had to do to survive in this new world. I just hope that a lot of the old Mac partisans will stop trying to cling to the past and join us for the ride.

    --
    This .sig for rent.
    1. Re:Is Siracusa a Mac bigot? by Apotsy · · Score: 1
      I had absolutely no interest in Macs until OS X, and the reason I switched was because it acts just like a *nix. I can pull up bash, run emacs, grep, sed, awk, etc.

      But you can do that for free on just about any hardware. Why switch just to get the same thing you already have?

      Naturally, little things like Quicktime, games, and DVD support sweetened the deal. :)

      Ahh, see? If it hadn't been for that you would have had no reason to pay the extra money for a Mac over a standard x86 box + Linux/FreeBSD/Whatever. You just confirmed what Siracusa was getting at. The things that the Mac does just-like-everybody-else aren't reason enough to buy it. It's the extra things that only the Mac does that make it worth it.

    2. Re:Is Siracusa a Mac bigot? by John+Siracusa · · Score: 2, Insightful
      From the article: Any part of the Mac OS user experience that exactly duplicates the experience on another platform ceases to be a compelling reason to buy a Mac. I totally disagree. I had absolutely no interest in Macs until OS X, and the reason I switched was because it acts just like a *nix.

      The fact that you can run Unix apps may have removed a reason for you to avoid Mac OS, but it is not a compelling reason to switch in and of itself. If Mac OS X acts "just like Unix", why would you switch to it from Unix? Obviously there was some other compelling reason to switch--something that differentiates it from other OSes that are also Unix or Unix-like. Those differences are what make people switch. Features that are the same merely remove those features from the decision-making process.

      P.S.-If you read any of the reader mail from my OS X reviews, you'd know that I'm really a PC bigot ;-)

    3. Re:Is Siracusa a Mac bigot? by RFC959 · · Score: 1
      Is Siracusa a Mac bigot?
      Yes. :-) I went to college with him, and his login name on the school's Unix cluster was "macintsh", and we had many a flamewar there over OSes. But he is also a very intelligent and educated guy, and thoughtful enough that the term "bigot" shouldn't really be applied to him. (If you care to do a bit of searching, you'll find he's written a good number of articles on things Macintosh.)
    4. Re:Is Siracusa a Mac bigot? by John+Siracusa · · Score: 1

      ...and what was your login name, Mr. Anonymous?

    5. Re:Is Siracusa a Mac bigot? by Anonymous Coward · · Score: 0

      mets

    6. Re:Is Siracusa a Mac bigot? by Anonymous Coward · · Score: 0

      24601

    7. Re:Is Siracusa a Mac bigot? by megaduck · · Score: 2

      Good point. However, my infatuation with OS X stems not from any single thing I can point to and say "that's a Mac thing". The compelling reasons to switch were because it:

      • Gave me all my Unix tools. (Unix)
      • Gave me Quicktime, Photoshop, and a bunch of games. (Windows)
      • Gave me DVD support. (Windows)
      • Gave me two mouse buttons and a scroll-wheel. (Just about everything but the MacOS)
      • Gave me a command prompt. (Again, anything but the MacOS)
      In short, it allowed me to take the Unix plunge without losing all of the applications that I previously had in Windows.

      Naturally, after I got into it there were other things that I liked about OS X. I love the Quartz display layer and all of its PDF goodness. I love Cocoa. I love the elegance of the dock. However, there are very few "Mac" things that I can point to and say, "I like that. Don't throw that out."

      Anyway, calling you a bigot was childish and I'd like to apologize for that. Even if I disagree on a few points, it was an excellent article and I encourage you (and the rest of the Ars crew) to keep up the good work.

      --
      This .sig for rent.
  35. Re:Before another non-insightful mac zealot posts. by generic-man · · Score: 1

    It's Really Easy (tm). You can either use ResEdit, an unsupported 68k application that Apple has all but disowned, or use some third-party utility.

    That's right. You need third-party software to change the file type on a Mac, and even then it's not easy. While Windows maintains a correlation between filenames and descriptions (.doc = Microsoft Word Document, for example) there is no such listing in Mac OS. True power users are just supposed to know that MSWD is a Word document.

    --
    For more information, click here.
  36. Creator/Type v. Extensions by Jeremy+Erwin · · Score: 2

    The author mentions that in CoreServices, two different Finders appear.
    Checking my /System/Library/CoreServices with terminal.app, I can see that one is simply called "Finder", the other is called "Finder.app". Changing my Finder view to "table" I can see that one is an "Application"; the other, a "Classic Application." So there are ways to differentiate the files-- though neither is quite elegant. The extensions are probably necessary for Nextstep compatibility.

    In Windows 95 and its successors, the GUI hides the extensions, and as the author points out, this can cause serious problems with vbs viruses. But what was left unmentioned is that it also is hard on programmers. If you can't tell the difference at a glance between "myclass.h" and "myclass.cpp", it really cramps your coding style...

    Microsoft also hides files that end in ".dll"-- which is a pain if you program libraries. This is somewhat more defensible, but not by much.

    Truth be told, although certain aspects of the Type/Creator code were far more elegant than anything Windows 9X ever developed (Note to Adobe-- grabbing the .ps extension for Distiller is just plain rude), the immutability of the Creator/Type codes, save for ResEdit, is somewhat inconvenient. I remember writing Applescript applications to change these codes en masse. Not exactly user friendly.

    1. Re:Creator/Type v. Extensions by Ben+Hutchings · · Score: 2

      For the sake of your sanity, it's a good idea to make sure Windows Explorer is configured properly before using a Windows account.

      The folder view options are accessible either by selecting 'Options' from the 'View' menu or by selecting 'Folder Options' from the 'Tools' menu, depending on version. In the 'Advanced Options' section of this dialog, you'll probably want to tell Explorer to:

      • Display the full path in the address and title bar.
      • Show hidden files and folders.
      • Not hide file extensions for known file types.
      • Not hide protected operating system files.

      The exact names of these options vary between versions; I'm reading these off Windows 2000.

    2. Re:Creator/Type v. Extensions by Jeremy+Erwin · · Score: 2

      I've made all those changes-- but in Windows, it seems that the choice lies between an interface that is mildly useful for programming, and an ugly/cumbersome one. The Ars Technica article suggests that this is a false tradeoff.

  37. OS/2's EA's by SCHecklerX · · Score: 2

    These did the trick.

    On HPFS, they were stored as part of the file in the filesystem. You could copy the file to a FAT formatted floppy, however, and the EA's were stored as a separate file, allowing you to keep all attributes, including the long file name.

    1. Re:OS/2's EA's by Linegod · · Score: 1

      You betcha they did. There are still things about OS/2 that I miss, and EAs are one of them.

      --
      -- I care not for your foolish signatures.
  38. MIME and BeOS by Rimbo · · Score: 2

    Before /. went kablooey earlier today, someone pointed out that BeOS used MIME for identifying file types.

    As for the MIME example you give above, it is as much the job of Windows to add (or ignore) the extension to a MIME'd file as it is for Apple to add the proper extension. In other words, I'd say it's the Windows machine's fault for not recognizing the file for what it was, just as much as it was the Apple's fault for not adding the extension. Interoperability requires both sides.

    1. Re:MIME and BeOS by melatonin · · Score: 1
      I'd say it's the Windows machine's fault for not recognizing the file for what it was, just as much as it was the Apple's fault for not adding the extension. Interoperability requires both sides.

      The Mac is doing its part. Every Mac email program I've used adds MIME types; they have no obligation to add file extensions. The Windows apps ignore that info though.

      --
      Moderators should have to take a reading comprehension test.
  39. Re:I HATE the MacOS and its stupid metadata! HATE by adjusting · · Score: 1

    > I know I'm practically alone in the Mac camp, but I hate metadata.

    I know a lot of Mac users(including me) who feel the same way. I think the metadata lovers are just a lot more vocal.

  40. BeOS Attributes by Splezunk · · Score: 1
    Well, the BFS system uses MIME types to identify files, but you can also add many attributes to the files. For example, all my MP3 files have file attributes on them that contain information about the MP3. It is great, 'cause you can search your file system for those attributes. Extremely quick. Also you have the advantage of having a database at your fingertips in the form of a file system.

  41. Re:I HATE the MacOS and its stupid metadata! HATE by Apotsy · · Score: 2
    That's why on every MacOS system I use, I always get this. I cannot live without it. That, combined with this, solve the problems you describe quite nicely.

    On Mac OS X it's a little different, though. The "Types Change" plugin isn't available (yet?). But the "Open Using" plugin isn't really necessary, since you can force-open any file by dragging to an app in the dock while holding down command and option. Hopefully there will be a way to change the type and creator of a file on X soon, and all will be back to normal.

  42. Waaah! File Extensions are bad! by Owen+Lynn · · Score: 1

    Phew, that article was long-winded. And 5 pages later, we finally discover the argument he was building up to. File extensions are evil, windows is kludgy, and OSX sucks.

    Having not played with OSX, I have no opinion about it one way or another. Criticizing windows is easy, and many many people have beaten that dead horse over the years.

    Which leads us to the last point he was trying to make, which is we should all get rid of file extensions. I'd like to ask him, what glorious benefit would we get from removing file extensions in all OS's in exchange for all the disruption it would cause, just to satisfy his sense of aesthetics? I'm not saying it's a bad thing, but geez, there are bigger problems out there to solve than this. This is the equivalent of arguing over whether the turn signal stalk on the steering column should also contain a rotating knob on the end for wiper blade control.
    Who cares!?

    Sounds like the rantings of another Macolyte, trying to convince the rest of us that he's better than us. You might be right, but the world stopped caring long ago.

    1. Re:Waaah! File Extensions are bad! by Anonymous Coward · · Score: 0

      As you go through life you have probably noticed that people hate you. This isn't just because you smell bad, it has more to do with your tendency to criticise things when you don't have anything constructive to add. This is related to your tendency to tear things down because you cannot create anything of importance yourself.

      Suicide is probably your best option, although you might consider joining the army or something.

  43. MIME Mess on Linux by GigsVT · · Score: 2
    I disagree with the idea that file extensions are a hack. I think that the nature of Linux lends itself well to the idea that the file type should be encoded into the name of the file in a human readable form.


    What I do have a problem with is the splintered way that MIME is done in practice. Suppose I want my file type "foo" to be associated with a certain mime type and opened with my fooviewer, I would have to register my application/x-foo in:


    /usr/share/mimelnk/application/x-foo.kdelnk
    /usr/share/applnk/Multimedia/fooviewer.kdelnk
    /usr/share/mime-info/fooviewer.keys
    /usr/share/mime-info/fooviewer.mime
    /etc/mime.types
    /etc/mailcap
    /usr/local/lib/netscape/mime.types
    /usr/local/lib/netscape/mailcap


    I can't even figure out what the heck Mozilla uses for local MIME types... It apparently isn't any of these, in the version of Mozilla I have. I see it makes some nice XML files for user defined types, but those don't work with plugins.


    Why can't we just standardize on using /etc/mime.types and /etc/mailcap? I mean come on!
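
    To illustrate what a single shared lookup table would buy, here is a minimal sketch of reading the mime.types text format (one MIME type per line, followed by its extensions). The sample text, the application/x-foo entry, and the helper names parse_mime_types and lookup are illustrative assumptions; this does not describe what KDE, GNOME, Netscape, or Mozilla actually do internally.

```python
def parse_mime_types(text):
    """Map each extension to its MIME type from mime.types-style text."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        mime_type, *extensions = line.split()
        for ext in extensions:
            table[ext.lower()] = mime_type
    return table


# Hypothetical sample content in the same layout as /etc/mime.types.
SAMPLE = """\
# type                 extensions
text/html              html htm
image/jpeg             jpeg jpg jpe
application/x-foo      foo
"""

TABLE = parse_mime_types(SAMPLE)


def lookup(filename, table=TABLE):
    """Return the MIME type for a file name, or None if the extension is unknown."""
    if "." not in filename:
        return None
    return table.get(filename.rsplit(".", 1)[1].lower())


print(lookup("report.foo"))
```

    With one table like this, registering the "foo" type is a single added line rather than an edit in every desktop's and browser's private files.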

    --
    I've had enough abrasive sigs. Kittens are cute and fuzzy.
    1. Re:MIME Mess on Linux by Ben+Hutchings · · Score: 2

      Mozilla's file type database is a bit broken at the moment, but should end up using mime.types and mailcap under Unix, as earlier versions did. I can't find the Bugzilla number for this at the moment, but it's in there somewhere.

  44. File extensions have advantages too by prodos · · Score: 1

    The author of this article brings up a lot of interesting points, but I think one thing he failed to address was some of the advantages the file extension style of metadata continues to have over the other types discussed. I have been tinkering with OS X for the last few months, and while I have generally been pleased or tolerant of the changes I am experiencing coming from the windows/linux pc world, there is one issue that really strikes me as I read this article. I am a student at Stanford University, which has all of its users' home directories mounted on a very large AFS network along with countless other schools. This is a very useful feature as I can seamlessly use my home directory on almost every campus machine (be it Mac or *nix, no Windows implementation yet I'm afraid). Naturally, when an AFS implementation was released for OS X, I wanted to give it a whirl. You can imagine my surprise when upon mounting /afs in OS X, the Finder window that automagically opens for newly connected/inserted media almost immediately locked up. I thought it might just be a bad implementation in the AFS client, but after reading a few posts on mailing lists I figured out what was actually going on:

    ** Background **
    For those of you who are not familiar with AFS, it is a large-scale network filesystem which uses the domain names of the various servers as sort of their filesystem roots. For example, John Doe's home directory at MIT might be /afs/athena.mit.edu/users/j/d/jdoe.
    ** Background **

    Apparently, when OS X connects to a network file system and opens the Finder window for that file system, it also goes out to check to see if there are any applications it should know about in that directory. Now here's the kicker: apparently the Finder doesn't just stop at checking for the .app extension (remember OS X apps are actually "bundles"/directories that have a .app extension) but it also figures it might as well look in every single directory for an included metadata file named "Contents" which every OS X application is supposed to have. So basically, my machine was opening every single server root in /afs, connecting to something like 40-50 AFS servers *worldwide* to see if they were applications or not. Needless to say this can take quite a long time, especially on a slow connection, and could probably have been mostly avoided if it had simply respected the file extension all applications have to use anyway. I guess the moral of this story is, file system and separate file metadata are all fine and good, but they can be a real pain in the rear for networking file systems if they are the only means of determining file type.
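
    The cost difference described above can be sketched roughly as follows. This is not the Finder's actual logic, only an illustration of the two strategies; the function names and the directory names are made up for the example. The name check touches nothing on disk, while the probe needs one filesystem lookup per directory, which on a network filesystem may mean contacting a remote server.

```python
import os
import tempfile


def is_app_by_name(path):
    """Cheap check: decide from the name alone, with no filesystem access."""
    return path.endswith(".app")


def is_app_by_probe(path):
    """Costly check: look inside the directory for a 'Contents' entry."""
    return os.path.exists(os.path.join(path, "Contents"))


with tempfile.TemporaryDirectory() as root:
    # A bundle-like directory and an ordinary directory (both hypothetical).
    bundle = os.path.join(root, "Example.app")
    os.makedirs(os.path.join(bundle, "Contents"))
    plain = os.path.join(root, "athena.example.edu")
    os.makedirs(plain)

    name_results = [is_app_by_name(p) for p in (bundle, plain)]
    probe_results = [is_app_by_probe(p) for p in (bundle, plain)]

print(name_results, probe_results)
```

    Both strategies give the same answer here, but only the probe has to open every directory to get it.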

    1. Re:File extensions have advantages too by Anonymous Coward · · Score: 0


      So basically, my machine was opening every single server root in /afs, connecting to something like 40-50 AFS servers *worldwide* to see if they were applications or not. Needless to say this can take quite a long time, especially on a slow connection, and could probably have been mostly avoided if it had simply respected the file extension all applications have to use anyway.


      Can you say: Hello Novell Directory Services!


      It's a shame Novell is going down the tubes.

  45. Best of both worlds by Enahs · · Score: 2
    Gee, want to have everything MacOS has without modifying the underlying OS to support resource forks?

    1.) make sure apps hide file extensions, preferring, instead, icons
    2.) Hell, UNIX people use extensions like .tar.bz2 to signify bzip2'd tar files, right? Get ready for 8bim.tif. (for anyone curious, 8bim is the creator code for Photoshop docs on a Mac.)

    Really, there's the wonderful, superior data you get on a Mac when dealing with a Photoshop TIFF. "8bim" as the creator code (huh?) and the more sensible "tiff".

    Sure, sounds great. *rolls eyes*

    Sure, feel free to rip me a new one if I didn't use the proper terminology. I mess with ResEdit maybe once a year. :-P

    --
    Stating on Slashdot that I like cheese since 1997.
    1. Re:Best of both worlds by J'raxis · · Score: 2
      Hell, UNIX people use extensions like .tar.bz2 to signify bzip2'd tar files, right? Get ready for 8bim.tif. (for anyone curious, 8bim is the creator code for Photoshop docs on a Mac.)
      These two situations do not compare at all. Multiple extensions like *.tar.bz2 usually mean that within one file is another: what I extract from the *.bz2 file is a *.tar. Neither BZ2 nor TAR is an ownership marker; they're both types.

      This can happen on the Mac a lot; I have many *.sit.hqx files lying around. Those are BinHex'd (a type of encoding) StuffIt archive files.
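
      A small sketch of the nesting, using Python's standard bz2 and tarfile modules. The file name and contents are invented for the example. Each extension names one layer, and the layers come off in the reverse order of the extensions.

```python
import bz2
import io
import tarfile

# Build a small tar archive in memory containing one file.
payload = b"hello from inside the archive\n"
tar_buffer = io.BytesIO()
with tarfile.open(fileobj=tar_buffer, mode="w") as tar:
    info = tarfile.TarInfo(name="readme.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
tar_bytes = tar_buffer.getvalue()

# Outer layer: bzip2-compress the tar bytes (this is what a .tar.bz2 holds).
tar_bz2_bytes = bz2.compress(tar_bytes)

# Peel the layers: first .bz2, which yields the .tar, then the tar members.
inner_tar = bz2.decompress(tar_bz2_bytes)
with tarfile.open(fileobj=io.BytesIO(inner_tar), mode="r") as tar:
    names = tar.getnames()
    extracted = tar.extractfile("readme.txt").read()

print(names, extracted)
```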
  46. Windows extensions can cause trouble by puetzc · · Score: 1

    I purchased a book on JavaScript about two years ago with an interesting bug. The book contained a CD with many examples. It was generally oriented towards Windows, but I assumed that there would be no problem in using the files on Linux and Macintosh systems. I was almost correct.

    To my surprise, most of the code did not work. Investigation revealed that all files were referred to as *.html in the code on the CD. In a misguided attempt to DOSify the CD, all files were stored with 8.3 formatted names. Windows happily went looking for the xxxx yyyy z.html, translated the name into xxxxyy~1.htm and loaded the file. Linux and Macintosh computers went vainly looking for the actual name referenced in the code. I had to copy the files to disk and rename them all in order to use any of the examples! I think that this is an excellent illustration of both the baggage inherited from DOS, and the author's point that the name is a poor place for important information.
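
    The name translation can be sketched roughly like this. It is a simplified approximation written for illustration, not the real Windows algorithm (which has more rules for illegal characters and for collisions); the function name short_name is my own.

```python
def short_name(long_name, index=1):
    """Simplified 8.3 alias: strip spaces, truncate, add a ~N suffix, uppercase."""
    stem, dot, ext = long_name.rpartition(".")
    if not dot:  # no extension at all
        stem, ext = long_name, ""
    fits = len(stem) <= 8 and len(ext) <= 3 and " " not in long_name
    if fits:
        return long_name.upper()
    stem = stem.replace(" ", "")[:6] + "~" + str(index)
    ext = ext.replace(" ", "")[:3]
    return (stem + "." + ext if ext else stem).upper()


# What ends up stored on the disc versus what the page's links ask for.
stored = {short_name("xxxx yyyy z.html")}
referenced = "xxxx yyyy z.html"
found_by_exact_name = referenced in stored

print(sorted(stored), found_by_exact_name)
```

    A system that only looks for the exact name in the link never finds the file, because the stored name is a different string.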

  47. Re:I HATE the MacOS and its stupid metadata! HATE by Tachys · · Score: 2

    I agree kind of with this. But Mac OS X fixes this. In the file info there is a place for application, where you can choose an application which opens only that file or opens all files of that type.

  48. ACLs by mattdm · · Score: 2

    Ok, first, the linux system of actually lookin' in a file to see what file type it is seems pretty un-idiotic to me.

    But more importantly, I strongly disagree with your point about ACLs. Different privilege levels might be useful (as opposed to simple user-or-root), but I don't see a good reason to apply this to a filesystem. As it is, it's very easy to see quickly exactly who has what rights to what area -- with complicated ACLs, everything can get confusing and you might not notice a security problem. Sometimes simple is good.

    The private groups notion is far from an "ugly hack" -- in fact, there's no "hack" involved at all: it's just *using* the group and umask functionality in a nice elegant way.
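
    Going back to the first point, looking in a file to see what type it is can be sketched in a few lines. The file(1) command works from a much larger database of signatures; the tiny table and the function name sniff below are just assumptions for the example.

```python
# A few well-known leading byte signatures ("magic numbers").
MAGIC = [
    (b"\xff\xd8\xff", "JPEG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"BZh", "bzip2 compressed data"),
]


def sniff(data):
    """Guess a type from the first bytes of the content, ignoring the name."""
    for signature, description in MAGIC:
        if data.startswith(signature):
            return description
    return "unknown"


print(sniff(b"GIF89a\x01\x00\x01\x00"))
print(sniff(b"just some text"))
```

    Note the weak spot: content with no distinctive header (plain text, source code) falls through to "unknown", so sniffing alone cannot tell a perl script from a shopping list.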

    1. Re:ACLs by motherfuckin_spork · · Score: 0
      agreed.

      ACLs seem to be the root of some troubles we've had at work. It seems like it'd be fairly straightforward to get privileges set correctly, but it's really not that "logical" (for lack of a better term). I much prefer the way unix handles this - you are very right - simple IS good.

      --
      Nope, not me, I must be someone else...
  49. I absolutely agree by hypermanng · · Score: 1

    AC's dead wrong and it's just silly. I mean, thinking about file size for a second... if the system doesn't know how long that "ordered set of bytes" is, how does it know when to stop reading? You could have a file end marker, of course, but there's more metadata for you.

    It just goes to show that the loudly incompetent are too incompetent to have inklings as to their own incompetence.

    --
    I am the one true god. However, as an atheist, I don't believe in myself. I guess I have a self-esteem problem.
  50. Names describe things. by Phlegm_iBook · · Score: 1
    "There are three people in this room with me. Their names are: John, Sarah, and Matt"

    Just by telling you their names, you also know something about their types:

    John is male

    Sarah is female

    Matt is male

    Duh. A name is supposed to describe the object it represents. Including the type of an object with its name has been around a lot longer than file systems.

    1. Re:Names describe things. by AndrewHowe · · Score: 2

      Hmm. What type is Courtney?

    2. Re:Names describe things. by Seth+Milliken · · Score: 1

      You obviously don't live in San Francisco.

    3. Re:Names describe things. by John+Siracusa · · Score: 1
      "There are three people in this room with me. Their names are: John, Sarah, and Matt" Just by telling you their names, you also know something about their types: John is male Sarah is female Matt is male

      Or so you choose to assume! And what about Pat or Toby or Morgan?

      A name is supposed to describe the object it represents.

      If Matt changes his (or her ;-) name to Morgan, does his/her sex change as well?

    4. Re:Names describe things. by Phlegm_iBook · · Score: 1

      No. And if I change readme.txt to readme.jpg, does it in fact become a compressed image?

      I'm just saying that filename extensions are a natural extension (no pun intended) of common language made precise enough for a computer to understand.

      Then again, common language can be pretty dumb.

  51. wondering... by Stochi · · Score: 1

    it seems as though there are two main types of metadata: OS dependent and user dependent. the OS dependent metadata is information important to the system (permissions, ownership, filesize, etc). everything else (file type, etc) would be user dependent.

    i'm wondering if there would be a way to have the OS dependent data remain static (as it usually is), but have the user dependent data be variable. for instance, you could have a 'type' field, but you might also want an 'open', 'print', or 'edit' field to specify perhaps which applications would be responsible for those actions.

    for example, i might view images in GQView, but i can't edit them in that program. so, i'd create a new metadata field called 'edit' that would maybe point to the Gimp as the editor for that file. as long as the OS and/or programs i used adhered to this, life would be simple.

    since the user dependent metadata would be variable, not all files would need unnecessary metadata fields. but for those files where the end-user might want a broader range of actions associated with the file, many different possibilities could exist.

    of course, file-type information would not have to be the only thing stored in metadata. you could have version information, author information, authoring program, etc... but the metadata would only include that which the user would want included.
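
    a rough sketch of the idea, assuming a made-up FileRecord structure with fixed system fields plus an optional dictionary of user fields. none of these names come from a real filesystem api; they only show how a per-file 'edit' field could override a per-type default.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    name: str
    size: int    # system-maintained
    owner: str   # system-maintained
    user_meta: dict = field(default_factory=dict)  # optional user fields


def app_for(record, action, defaults):
    """Prefer a per-file field for the action, else fall back to the type default."""
    if action in record.user_meta:
        return record.user_meta[action]
    file_type = record.user_meta.get("type")
    return defaults.get((file_type, action))


# Hypothetical per-type defaults: images open in a viewer.
DEFAULTS = {("image/png", "open"): "gqview"}

photo = FileRecord("photo.png", 20480, "user", {"type": "image/png", "edit": "gimp"})
notes = FileRecord("notes", 120, "user")  # carries no user metadata at all

print(app_for(photo, "open", DEFAULTS), app_for(photo, "edit", DEFAULTS))
```

    files that need nothing extra carry an empty dictionary, so the variable part costs nothing unless the user asks for it.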

    just an idea...

  52. You're missing the point... by alexhmit01 · · Score: 2

    The point is that you like to see the type as a name next to the file. This is a question of interface, not of metadata. The SAME information for the extension could be shown there.

    You could sort by Type instead of "By Type - a hack off the extension".

    The point of this was that there is no advantage to storing it in a limited extension as opposed to meta data.

    Even more important (that the article went into) is that you can change the extension without changing the file. Think about that, you've changed what the type of the file is without altering the data? That is the point of keeping it as Meta-data, the type should never change without changing the data inside it to another type.
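
    A quick way to see this for yourself, sketched under the assumption of a two-entry extension table (EXTENSION_TYPES and type_from_name are invented for the example): the rename changes what a name-based system believes about the file while every byte stays the same.

```python
import os
import tempfile

# Hypothetical extension-to-type table, as a name-based system might keep.
EXTENSION_TYPES = {".txt": "text/plain", ".jpg": "image/jpeg"}


def type_from_name(path):
    """Derive the 'type' purely from the file name's extension."""
    return EXTENSION_TYPES.get(os.path.splitext(path)[1].lower(), "unknown")


with tempfile.TemporaryDirectory() as folder:
    original = os.path.join(folder, "readme.txt")
    renamed = os.path.join(folder, "readme.jpg")
    with open(original, "wb") as handle:
        handle.write(b"Just some plain text.\n")

    before_type = type_from_name(original)
    with open(original, "rb") as handle:
        before_bytes = handle.read()

    os.rename(original, renamed)  # only the name changes

    after_type = type_from_name(renamed)
    with open(renamed, "rb") as handle:
        after_bytes = handle.read()

print(before_type, "->", after_type, "| bytes unchanged:", before_bytes == after_bytes)
```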

    Alex

  53. Apple Evolving by maggard · · Score: 2
    As usual John Siracusa brings up excellent points. However there are a few places that perhaps he's glossed over or disagrees with that I feel could be important:
    1. John argues that the OS can handle flattening files and creating file extensions when they are written to transports & filesystems that don't support the MacOS metadata properly.

      This relies on the MacOS always having appropriate mappings between filetype/creator codes and those annoying DOS extensions - not something that is always possible. Furthermore in an increasingly networked future it's not always assured that files will pass directly in & out through the OS but rather will likely just as often come & go through alternate transports, all of which would have to be rewritten to support this. As this enforced-extension functionality is already standard in many applications it seems reasonable to simply codify it there rather than rewrite everything else, particularly as the creating application will have far more insight into the appropriate extension than the OS could.

    2. John argues that the user should always have control over a file's naming and not the OS, yet acknowledges that renaming-with-extensions will often be required in a networked multi-OS environment.

      Personally I would always prefer any extension-addition be made and clearly communicated when I explicitly create a file and not later when it passes in and out of MacOS-metadata-supporting networks and filesystems. Just as John is appalled at the proposal for hiding these extensions from the user's view I'd be appalled at their being automagically added to my file's names at some later date when they may get moved around or viewed from another OS. At least when I name a file "whiz" and the application insists on creating it as "whiz.bang" I know about it, I don't find out later that my "whiz" is that on some servers and "whiz.bang" on others or it's "whiz" for the other Mac users and "whiz.bang" to the *nix & Wintel folks.

    3. Finally John views the possibility of Apple moving from its MacOS X HFS+ native filesystem to some other with alarm; I see this as evolution.

      HFS+ is a fine filesystem but it's unique in an increasingly unnecessary way. Other more modern filesystems are being created and if MacOS X is to remain current it needs to keep up and take advantage of these advances. Journaling filesystems are poised to become a standard feature of modern *nix implementations - should Apple lock themselves out of this? Furthermore it's not obvious that new filesystems will necessarily obviate the MacOS-metadata (ReiserFS seems particularly well poised to eventually incorporate much of this) but preparing for all eventualities seems wise.

    Apple no longer lives in its own comfortable bubble. It's now a peer OS in a world that is increasingly sophisticated and fast-moving. Having grafted MacOS's strengths onto the Next operating system Apple has now entered the rejuvenated unix environment and needs to compete not only on its own terms but also on those of the other modern operating systems.

    While it remains important to retain those strengths that have made MacOS such a survivor it's also necessary to not hobble it with dependencies on unnecessary Apple-only limitations. Flexibility is the order of the day and this includes some reasonable level of filesystem versatility. Apple already supports a variety of filesystems; now the time has come to allow for the possibility of multiple "native" ones while retaining much of its vaunted metadata strengths.

    --
    I don't read ACs: If a post isn't worth so much as a nom de plume to its author then I won't bother either.
    1. Re:Apple Evolving by John+Siracusa · · Score: 1
      John argues that the OS can handle flattening files and creating file extensions when they are written to transports & filesystems that don't support the MacOS metadata properly.

      Actually, one of my points was that "flattening" is no longer necessary in OS X (in the future, anyway) since resource forks are now deprecated.

      This relies on the MacOS always having appropriate mappings between filetype/creator codes and those annoying DOS extensions - not something that is always possible.

      Yes, the same way it is possible that a particular Windows machine will not know what a ".qyz" file is. No matter what the file type metadata system, there will always be "unrecognized" file types even when the file type metadata is present.

      Furthermore in an increasingly networked future it's not always assured that files will pass directly in & out through the OS but rather will likely just as often come & go through alternate transports, all of which would have to all be rewritten to support this.

      ...or users who want to use such protocols can simply choose to have their apps append file name extensions. Again, my objection is to making them mandatory in OS X.

      And yes, long term, all network protocols should support extensible metadata :-)

      Personally I would always prefer any extension-addition be made and clearly communicated when I explicitly create a file and not later when it passes in and out of MacOS-metadata-supporting networks and filesystems. Just as John is appalled at the proposal for hiding these extensions from the user's view I'd be appalled at their being automagically added to my file's names at some later date when they may get moved around or viewed from another OS.

      See above. Configuration is desirable. Forcing extensions everywhere is the big problem, just as forcing the lack of extensions would be, were some OS to do so.

      Finally John views the possibility of Apple moving from its MacOS X HFS+ native filesystem to some other with alarm; I see this as evolution.

      The alarm is based on the speculated move away from file type metadata stored separately from the file name. I have no particular attachment to HFS/HFS+. Any file system with (preferably extensible) metadata support will do :-)

  54. Resource forks != meta data by Gorimek · · Score: 2

    Mac files do take some extra space, but it is the resource fork that causes it, not the meta data. Since the resource fork is implemented as a separate file, a small Mac file uses 2 disk sections, not one. The type/creator meta data is only 8 bytes.

    The article addresses the common misconception that the resource fork is meta data at some length in http://arstechnica.com/reviews/01q3/metadata/metadata-6.html ("Second, I mention it because...")

    I agree that handling metadata in MacOS should be easier. Just a simple command to view and edit them would solve most problems. But don't confuse that lack of tools with a fundamental problem with the meta data concept itself. And as someone else pointed out, there are pretty good freeware tools available to fix this.

    1. Re:Resource forks != meta data by Chasing+Amy · · Score: 0

      Well, where I said "I hate editing resource forks" toward the end, I should have said "I hate editing metadata." And though type/creator and the resource fork are separate, they are usually manipulated with the same tools.

      However, the first time I said "resource forks" that's exactly what I meant. It's the resource fork that takes up so much extra space on HFS and HFS+ filesystems. And the resource fork is there to store, contrary to what Siracusa says, metadata, not usually data. I think it stems from a weird Mac-centric definition of data he has.

      The resource fork is not generally used to store any data within a data file type. An executable file type, and some others, store important *data* in the resource fork, without which they often cannot function. However, any plain data files are stored in their entirety in the...data fork. Go figure.

      Things like labels and customized icons are stored in the resource fork. I think these things are clearly metadata--data about the thing in the data fork. For example, the fact that a piece of data is labeled with one of the classic Mac label colors, or that the user wants the file displayed with an icon other than the standard one for that file type, is clearly metadata--not at all an essential part of the file. Unlike what Siracusa said, all basic data file types which I can think of are perfectly usable even when sent from a Mac without their resource fork. Another Mac may not know what to do with them, but on other systems giving them an appropriate extension will immediately let the PC know what they are and what to open/edit/etc. them with if they're clicked on. For example, a ClarisWorks file still has all its data intact, even if the resource fork goes away. A text file will still be intact. A Stuffit file will still be un-stuffable, though I've noticed on OS 8 at least that if the resource fork is not present double-clicking a .sit gets no results, it has to then be manually dropped onto Expander (so it must store *something* interesting there, but nothing absolutely essential). A GIF is still a GIF, etc., and perfectly usable. No essential data is stored in the resource fork for most types of non-executable files. It's almost all inessential metadata, whether Siracusa wants to admit it or not.

      --

      Chasing Amy
      (We all chase Amy...)
      "The more corrupt the state, the more numerous the laws"-Tacitus
  55. The Mac Way by piecewise · · Score: 2

    After using the Mac for 10 years (and the PC alongside... and a lot of Linux as well), and after reading the mentioned article, I must say the Mac way is the best way -- in a closed model. For years, it's been so convenient: no file extensions, nothing to worry about. Heck, even if I add ".txt" to a Photoshop file, it'll still open with Photoshop correctly.

    Add to that Windows.
    Now the Mac has to be aware of Windows files. It's a Mac "control panel" called File Exchange. If there's a file without type/creator metadata (which the Mac depends on, in part), File Exchange says, "Hey, that's a Windows file that ends in .psd. That's really a Photoshop file!" So, it opens them with Photoshop. Alright sir, 'nuff said.

    Add to that networking/internet.

    Now the Mac not only has to worry about file extensions, but also its forks (data fork / resource fork). So if I send a program over email -- even to another Mac -- the result will be garbled data that won't work. I have to first convert the Mac file to MacBinary -- which squooshes together the forks. On the other side, it can be uncompressed into a two-fork program again and it works perfectly.

    Eh, sorta annoying, but I compress things I email anyway because I hate emailing huge files.

    Mac OS 9 gives me no problems. Files work right over PC networks, etc. Mac OS X works even better over networks -- in fact, in my work it is much smarter and works more efficiently than Windows NT (or even Linux).

    The problem? Now Mac users have to worry about file endings --- in a sense. Applications use the Bundle methodology. Works great!

    Files, on the other hand, *sometimes* need extensions.

    When don't they?
    If the file is opening in a "Classic" application (meaning it is being run through the old mac os 9 codebase... though it's not really "emulated"). Because those "old" files HAVE type/creator codes the Mac understands.

    When DO they?
    If it's purely a Mac OS X file, for the most part. Now in 10.1, the file endings can be hidden. But that doesn't solve the real problem: the Mac is battling PC/Unix files from the net AND its original OS 9-and-lower files that now have to carry redundant metadata.

    Apple really needs to solve this. I know a lot of the OS X programmers, and they're extremely committed and bright, so I'm sure the problem will be fixed. Most importantly: make the Mac work great over the 'net (which it really does so far), AND make the experience very easy. I HATE file extensions. I love the old type/creator method. But I'm sure it could be done even better to satisfy all.

    --
    The next comment I write will be ready soon, but subscribers can beat the rush and see it early!
  56. MS will be moving to a database core by alanjstr · · Score: 2

    A few articles at The Register have talked about how MS will be "building SQL Server into the OS - effectively making the file system a relational database." This will greatly improve efficiency, and concentrate them into one database format instead of a mixture. And it's even probably legal.

    1. Re:MS will be moving to a database core by King+Babar · · Score: 2

      A few articles at The Register [theregister.co.uk] have talked about how MS will be "building SQL Server into the OS - effectively making the file system a relational database." This will greatly improve efficiency, and concentrate them into one database format instead of a mixture. And it's even probably legal.


      Words fail me here. There are many beautiful reasons why a relational model makes perfect sense for a file system, and, for that matter, why the notion of "file" itself might be changed for the better in such a system. And, if there is any advantage to monopoly, it is that a tyrant with good ideas can actually make them work.


      The problem is, of course, that the good idea is the notion of a relational file system. SQL is, was, and forever shall be a disgusting hack. Everybody who does research in databases knows this, and almost everybody who works with SQL on non-trivial things knows this, too. And MS knows this as well. Indeed, they could, if they wanted, implement the relational model in a nice, clean way and support it with a query language that's less Cobol-esque and they could make everybody use it.
      The problem is, though, that this doesn't maximize market share by ripping apart its competitors like kleenex pinatas. To do that, you have to first embrace, then extend, then spend a decade to clean up the mess you made in the first place. In other words, SQL Server in the OS is the quintessential Microsoft move. But, hey, at least it does solve the file associations problem in a slightly less dorky way...

      --

      Babar

    2. Re:MS will be moving to a database core by core10k · · Score: 1

      Damn, you (and Microsoft) just gave me a hardon. That's a *great* idea.

  57. Re:I HATE the MacOS and its stupid metadata! HATE by Anonymous Coward · · Score: 1, Insightful

    I think a lot of the metadata haters just don't know what metadata is doing for them. Every time you hear someone say "Mac OS is more elegant", they really mean that it doesn't keep popping up dialog boxes telling you off or warning you you're about to break something. Much of the time this is because that information is being stored and so applications just generally appear to know more about what's going on, so they don't have to ask the user.

    Not having a satisfactory method for modifying metadata is hardly an argument against having metadata; it's just an argument for having a satisfactory method of modifying metadata. There are countless utilities on the Mac for doing this. All Mac OS X needs is to show a field for file type along with the field for filename and permissions and such in the file Inspector.

    The argument the article makes is that we shouldn't just throw away all the metadata that's already attached to files just because it's inconvenient to store it on legacy filesystems.

  58. Re:I HATE the MacOS and its stupid metadata! HATE by Fred+Ferrigno · · Score: 4, Interesting

    This isn't a problem with metadata, just a problem with MacOS' file typing.

    BeOS handled all this very well. Double click to open with the default app. Right click to see a list of every program on your hard drive that opens that kind of file or files like it. (I.e., a text editor would show up as an option for an HTML file.) Choose another option and open a dialog to set a file-specific preference.
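
    A rough sketch of the kind of lookup described above, assuming MIME-style type strings where the part before the slash is the supertype. The handler table, the application names, and the `candidates` function are all made up for illustration; this is not the BeOS API.

```python
# Hypothetical handler registry: MIME-style type -> applications that claim it.
HANDLERS = {
    "text/html": ["Browser", "HTML Editor"],
    "text": ["Text Editor"],  # supertype handlers accept any text/* file
    "image/jpeg": ["Image Viewer"],
}


def candidates(file_type, preferred=None):
    """Return apps for a file: exact-type handlers first, then supertype ones.

    If a per-file preferred app is given and is a valid candidate,
    it is moved to the front of the list.
    """
    supertype = file_type.split("/", 1)[0]
    apps = list(HANDLERS.get(file_type, []))
    for app in HANDLERS.get(supertype, []):
        if app not in apps:
            apps.append(app)
    if preferred in apps:
        apps.remove(preferred)
        apps.insert(0, preferred)
    return apps
```

    With a table like this, an HTML file lists the text editor as an option without anyone having registered it for HTML specifically, and the per-file preference only reorders the list.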

    I must have said "BeOS did it better" about six times today. I feel like an Amiga user.

  59. fundamentals by TheWoundedSeagull · · Score: 1

    "Files" are an abstraction.

    "Files" are used for both data transfer and data persistence.

    The brilliance of Unix was that everything is a file, which allows, among other things, for a kind of closure (i.e. cat fred | awk | sed...).
    It is a very useful abstraction.

    The abstraction has limitations.

    When transferring data - you have to describe what the data is (metadata). The description must be in terms of a common frame between the transferring parties, or the data must be "self describing" at the file level or the protocol level.

    When storing or persisting data - you want it to hang around reliably and efficiently. There are layers of implementation and interface. The requirements of these interfaces are not the same as for data transfer.

    I think some of these difficulties come from taking the file abstraction too seriously, and not considering the difference between storage and transfer.

    i.e. I agree with the conclusion reached by the article - other OS implementations are bleeding into OS X in the name of interoperability. They don't have to, you just have to know that there is some sort of transformation between the files that are stored and the files that are transmitted.

  60. File size as metadata by HalfFlat · · Score: 1

    No the AC is not wrong on the issue of file size.

    There are two sorts of information at stake: one is data which is an adjunct to that contained in the file; the other is information that is a function of the data in a file. Size is a function of the file data, which is an ordered set of numbers. Sure, this number has to be stored somewhere, so that the files can be meaningfully read - but then so do the bytes of the files themselves! Storing the size is simply an implementation concern - the size is an intrinsic property of the data.

    The more information you can extract from the file contents itself, the better in my opinion. Metadata - as the article pointed out - can easily be mangled or lost. In this regard, I'm a big fan of the default Unix scheme of /etc/magic, though I do strongly believe it could be improved. We can't throw out all our old file formats, but we can certainly agree on a standard form of file preamble, which say contained a mime type or other globally recognized unique type identifier, to be applied to new formats developed. Some mechanism for users to be able to extend the /etc/magic system would be nice, too.
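
    As a minimal sketch of content-based typing, assuming a tiny hand-written signature table: the real magic database used by file(1) is far richer (offsets, masks, nested tests), and the `SIGNATURES` table and `sniff` function here are illustrative names, not part of any existing tool.

```python
# Small table of well-known leading byte signatures ("magic numbers").
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"%PDF-", "application/pdf"),
]


def sniff(data, default="application/octet-stream"):
    """Guess a MIME type from the first bytes of a file's contents."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    return default
```

    A user-extensible scheme would amount to letting people append their own entries to a table like this one; formats with no distinctive preamble fall through to the default, which is the gap a standard preamble would close.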

  61. Re:I HATE the MacOS and its stupid metadata! HATE by Frymaster · · Score: 2
    What a pain it would be to have to use a metadata editor instead of just manipulating three letters in filenames

    tell application "Finder"
        set creator type of file "foo" to "8BIM"
        set file type of file "foo" to "EPSF"
    end tell


    run that in smile and your problems are solved... or, you can just use snitch.

  62. Re:BeOS has already SOLVED the FileType/Metadata p by Anonymous Coward · · Score: 0

    I agree, the BFS is one of the most impressive things I ever laid my eyes on.

  63. That picture is wrong by Anonymous Coward · · Score: 0

    Commander Taco is not circumcised. That's why he smells so bad...

  64. Re:I know who you want! by Anonymous Coward · · Score: 0

    LoL!!!! Somebody mod this UP!!!!

  65. Be already discovered that was a bad idea. by drewness · · Score: 1

    That is an interesting, but probably not good idea. Be tried basically that idea with their first file system, and it was pathetically slow. BFS has database-like elements, e.g. the journal and the indexed metadata in key = value format. Dominic Giampaolo and his team at Be learned the hard way that putting an actual database in the filesystem is not good. Practical File System Design with the Be File System (ISBN 1-55860-497-9) is a good read about how filesystems work, and only requires a basic knowledge of C to understand. He talks about BFS, ext2, XFS, NTFS, and HFS (focus on BFS of course) and how they address several design issues. I recommend checking it out.

  66. file types, mac vs windows by doom · · Score: 2
    Just wanted to point out that under the Mac system, the fact that you can click on two different files of the same type and end up in different applications can actually be *tremendously* confusing to a naive user. This capability isn't necessarily something to be proud of.

    Similarly, the windows system, with file extension associations that are essentially a total mystery to the average user is also tremendously confusing. You can install some lame-ass scanner software and have it decide that it owns all the image file types you used to have associated with photoshop. Now, how do you get back to normal?

    The point that I'm making is that doing almost anything "automagically" has the potential of being a source of confusion. UI designers need to think a little bit more about empowering the user rather than just concealing things from them. Obscurity != ease of use.

    (I strongly suspect that hiding file extensions by default was a really bad idea.)

  67. Re:Apple sucks by Anonymous Coward · · Score: 0

    You failed to mention that it is your picture up above.

  68. Oops, forgot to add this... by Chasing+Amy · · Score: 1

    > I agree that handling metadata in MacOS should be easier. Just a simple command to view and
    > edit them would solve most problems. But don't confuse that lack of tools with a fundamental
    > problem with the meta data concept itself.

    The problem is that we already have a much simpler and more elegant solution for the most commonly used metadata, type/creator. I do not think it is at all possible to make it easier to access and change type/creator from a GUI, than it is to access and change the file extension. So if there is no advantage to type/creator metadata, there would be no reason to use it at all. I spent the better part of my post above showing why type/creator metadata causes problems when files are exchanged, and how it is just as easy to use the context menu or Send To submenu in a Windows system to launch different files of the same type in different applications. Therefore, I see no reason to have type/creator metadata in the first place, since simple typing by extension with a context menu available on right-click is at least as effective, yet provides more flexibility in terms of the context menu commands or Send To menu apps available to launch the file, and easier and faster access to manipulating the file's actual type.

    I think the problem is a narrow-mindedness on the part of *some* Mac users, who have not cared to even learn the Windows way of doing things and so think the Mac's metadata is some huge advantage when in fact the Windows way is easier and more flexible if you give it a chance. The MacOS simply lacks an equivalent context menu and Send To submenu--the context menu it has isn't as powerful and flexible, and so cannot do the job, and so Mac users see anything that relies on a context menu to be suspect if it could be done without. Of course, Apple would have to adopt a standard 2-button mouse before it could implement a system that relied on the context menu for opening files in alternative ways, and Apple is so unfortunately wedded to the 80s notion that more than one button would be too confusing for new users. Funny how new Windows users seem to get the hang of it...

    Point is, I am a Mac fan, but use Windows too and can accurately judge the features of each. And I can say that Windows' simple file extensions are easier to use and manipulate, while its adaptive right-click menu gives you almost all the advantages of type/creator metadata with even more flexibility and none of the drawbacks. I'm glad the newer Linux GUI environments are mimicking this aspect of Windows, rather than the metadata of MacOS. As I said, I cannot think of a way to make the metadata as accessible through the GUI as a file extension is.

    --

    Chasing Amy
    (We all chase Amy...)
    "The more corrupt the state, the more numerous the laws"-Tacitus
  69. Linux isn't following Windows, it's following UNIX by mj6798 · · Score: 1
    Linux isn't following Windows. Linux is following UNIX. UNIX deliberately has flat, simple, unadorned files. This isn't out of ignorance, it is out of careful consideration of the alternatives. If you want to build some metadata scheme on top of that, you can. Many applications do.

    The most common scheme for metadata support under Linux is to treat directories and directory trees as units of information; Linux end-user applications unfortunately don't take enough advantage of this. Another common scheme now is to use XML. Yet another approach is to use a relational database for metadata.

    What UNIX/Linux is missing is file change notifications and efficient support for lots of small files. ReiserFS looks to change that.

    Windows NT and its successors have, in fact, database-like functionality and multiple forks in their files. I very much hope Linux will not be following either MacOS or Windows down this path. UNIX was in part created as a rebellion against that kind of creeping featurism.

  70. why use files... at all? by orangesquid · · Score: 2

    Why does the file name have to be there?

    Why are we still using file systems?

    Computers are quite capable of managing huge organized trees of data. Why are we still fighting with bitstreams like this?

    --
    --TheOrangeSquid Is it any wonder things seem so awry? We swim in a sea of confusion and don't have to think to survive
    1. Re:why use files... at all? by Slur · · Score: 1

      Would you care to propose an alternative method or two?

      It appears that computers and storage devices are still in a state of linear processing. Bytes are fetched and stored in linear space and time. Filesystems are organized in a tree-like structure, but they have to be traversed in linear steps.

      Are you holding out for some new paradigm?

      --
      -- thinkyhead software and media
    2. Re:why use files... at all? by orangesquid · · Score: 1

      Actually, some of the models from object-oriented programming and the like for managing data may be a good place to start...

      --
      --TheOrangeSquid Is it any wonder things seem so awry? We swim in a sea of confusion and don't have to think to survive
  71. Other ways to store type information by rpk · · Score: 1
    There are two ways to store type information in a file that are like more sophisticated versions of file(1):
    • If the data is XML-formatted, you could look for something like a DTD or schema reference.
    • In OLE-land, a DocFile is structured as a container of typed files, recursively. The type of the top-level container can be considered the type of the file. DocFiles themselves start (or end) with a distinctive signature.
    You can still use Mac OS X in "the old Mac style" in terms of not paying attention to extensions, but it's sad to see the backwardness of the rest of the world finally taking its toll. Perhaps it is time for a standards group to come up with some minimal metadata interchange ideas and apply them to data transfer protocols. It's pretty obvious that MIME would be one of the building blocks.

    And, on the broader subject of metadata, I think that at least two of the native Lisp Machine file systems (one of which was written by RMS) allowed for user-defined metadata in the form of a property-list, so you could store any property on a file as long as the data that was PRINTed could be READ back in.
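
    A loose Python analogue of that PRINT/READ round trip, offered only as a sketch: properties are kept as a dict, written out with `repr`, and read back with `ast.literal_eval`. The function names are hypothetical, and this says nothing about how the Lisp Machine file systems actually stored their property lists.

```python
import ast


def dump_props(props):
    """Serialize a property dict to text, roughly like PRINTing it."""
    return repr(props)


def load_props(text):
    """Parse the text back into a dict, roughly like READing it.

    ast.literal_eval only accepts literals (strings, numbers, lists,
    dicts, and so on), so arbitrary code in the text is rejected.
    """
    return ast.literal_eval(text)
```

    The same constraint applies as in the original scheme: a property is only storable if its printed form can be read back to an equal value.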

  72. Re: BeNews by Eugenia+Loli · · Score: 1

    Hardware failure I think. :(

  73. Think again. Or perhaps just think. by hypermanng · · Score: 1

    Unless a string of data is floating about in a vacuum, with nothing else to be read, there is nothing to distinguish the bytes in that string of data from bytes in OTHER strings of data. If you add an EOF marker so the system knows where to stop, that's additional information. If you add DCB information with the requisite BDWs and RDWs like on an MVS system, then no EOFs are necessary because the system counts until it has as many bytes in memory as it was told were there to be had and stops.

    SO, file size is most definitely metadata. There's no way of knowing what is in a set of data unless you have additional information which is not the data itself.
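
    To make the point concrete, here is a small sketch of length-prefixed records. It assumes a plain 4-byte big-endian length in front of each record; that layout is an illustration only and is not the actual MVS RDW/BDW format.

```python
import struct


def pack_records(records):
    """Prefix each record with a 4-byte big-endian length."""
    return b"".join(struct.pack(">I", len(r)) + r for r in records)


def unpack_records(stream):
    """Recover record boundaries using only the stored lengths."""
    out, pos = [], 0
    while pos < len(stream):
        (n,) = struct.unpack_from(">I", stream, pos)
        pos += 4
        out.append(stream[pos:pos + n])
        pos += n
    return out
```

    Concatenate the same records without the prefixes and the boundaries are gone: the bytes alone cannot say where one record ends and the next begins, which is exactly why the length counts as data about the data.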

    Even if this were not true (say, in the data-in-vacuum example), the fact is that META data is "data about data". As others have said, just because the information is intrinsic to the set doesn't mean it is the same as the explicit data itself. I can tell by browsing the hex of an AFPDS file that it's an AFPDS file, so I could write that information somewhere else close by, and it would be metadata, even though it was derived. The simple fact that it is an AFPDS file does nothing within the context of an AFPDS file. By the time it's being processed AS an AFPDS file, the fact that it's an AFPDS file is pretty irrelevant and changes processing not a bit. If it was really a saved character from Everquest that happened to also function as a valid AFPDS file, nothing would change about the result.

    As long as you think a file preamble/header/footer/whatever is NOT metadata because it's "in the file", then you're never going to understand what metadata really is. Do you see?

    I admit to being an asshole. But that doesn't mean one doesn't deserve to be pilloried when one tries to pretend one knows one's shit better than the guy writing the article - and fails.

    And btw, if you don't know my (rather obscure) acronyms, just ignore them. Their particular meaning is not important to the argument.

    --
    I am the one true god. However, as an atheist, I don't believe in myself. I guess I have a self-esteem problem.
  74. Re:BeOS has already SOLVED the FileType/Metadata p by kaeru · · Score: 1

    BeOS does this so well, that I use it as my MP3 juke box. The ability to search using a GUI, in milliseconds for songs from a particular artist, or title, running on an OS that boots up in seconds means that there will always be a copy of BeOS in my house for this purpose to make use of old second hand Pentiums that people think are too slow for their latest and greatest copies of Windows.

  75. A note from the asshole by hypermanng · · Score: 1

    The reason why I'm being such a jerk about this is that John is being so cool and really quite nice to all you self-impressed punks. Since he's not willing to give you the what-for, someone should write in service of the scales of justice. I know I'm not the only person on /. tired of people imagining themselves such experts that they can talk about "fallacies in fundamentals" and lay down the law. ArsTechnica is generally scrupulously accurate and it's the height of pomposity to be like that when you don't actually know what you're talking about. It's damned disrespectful of someone who's trying to share his knowledge with you FOR FREE!

    So be quiet.

    --
    I am the one true god. However, as an atheist, I don't believe in myself. I guess I have a self-esteem problem.
  76. Amiga: IFF and Data Types by xixax · · Score: 2

    While AmigaOS used .info files to keep track of things like applications, I did like the way IFF was used as _the_ file format. Sounds, images, they were all stored in IFF files that kept track of what exactly the file held. Sort of like a bundle.

    Then there was Data Types. The theory was that if an App knew about data types, it didn't need to know how to write a particular format as long as Data Types did know. I liked this idea. If a new format came around, I didn't need to update all my apps (as long as they knew how to use Data Types).

    Xix.

    --
    "Everything is adjustable, provided you have the right tools"
  77. You're limiting your view on file type by MO! · · Score: 2
    Sorry, but from my perspective, the association of a file type that is unable to be changed without changing the data is wrong for the simple reason that a file may have more than one type!


    Yes, index.html is an HTML formatted file - yet it's also a TEXT file - and it may contain Java/PHP/etc Scripts embedded in it. Do I want to limit a file to only one "Type" when it can be many, depending upon how I choose to use it at a given moment? Of course not!


    It's true the older MacOS file type methodology worked well when the Mac was used as a much more limited system. The fact that MacOS X includes Apache is an example of how much more versatile the Mac is today. In order to be more versatile, you have to reduce some things to the LCD. To do otherwise limits the usage of the system and that's exactly why the Mac has remained the cherished possession of so few. To expand market share, Apple needs to expand the uses of the Mac - that means more flexibility, and less stringent ties to those technologies (however useful in old-school aspects) that limit that flexibility.

    --
    I AM, therefore I THINK!
    1. Re:You're limiting your view on file type by John+Siracusa · · Score: 1
      Sorry, but from my perspective, the association of a file type that is unable to be changed without changing the data is wrong for the simple reason that a file may have more than one type! Yes, index.html is an HTML formatted file - yet it's also a TEXT file

      Both hierarchical types and file type accuracy/resolution (which is what you're getting at) were addressed in the article.

      It's true the older MacOS file type methodology worked well when the Mac was used as a much more limited system. The fact that MacOS X includes Apache is an example of how much more versatile the Mac is today. In order to be more versatile, you have to reduce some things to the LCD. To do otherwise limits the usage of the system and that's exactly why the Mac has remained the cherished possession of so few. To expand market share, Apple needs to expand the uses of the Mac - that means more flexibility, and less stringent ties to those technologies (however useful in old-school aspects) that limit that flexibility.

      To quote the article, "any part of the Mac OS user experience that exactly duplicates the experience on another platform ceases to be a compelling reason to buy a Mac." Furthermore, regressing to more primitive metadata just as the rest of the industry progresses to more sophisticated metadata (e.g. MS's Blackcomb/SQL file system rumblings) would be a very bad move.

    2. Re:You're limiting your view on file type by Anonymous Coward · · Score: 0

      >Yes, index.html is an HTML formatted file - yet it's
      >also a TEXT file - and it may contain Java/PHP/etc
      >Scripts embedded in it. Do I want to limit a file to
      >only one "Type" when it can be many, depending
      >upon how I choose to use it at a given moment?
      >Of course not!

      no. it's always a text file. it's the contents of the text file that are some html code, some text code or your cv. the program doesn't *have* to know about the contents of the file, just the format. nothing keeps you from ending a file in .html on a mac, and some programs (like bbedit) will make use of the ending, for example for syntax coloring. but the programs don't *need* to do it, and all programs that can read text files can read html files or c-files or your cv, without having to know about html, c or your life.

      that's what's so great about type codes!

  78. Just not true by Auckerman · · Score: 2
    "The part that still bothers me, now that capacity is no longer a substantial issue, is that in Windows or *nix I can instantly change file types from the interface, but not with Mac. It comes up a lot--many times a day."


    In the current shipping version of MacOS (read X) you can "Show Info" on a file (same thing as "Get Info" in OS 9) and there is a pull-down menu that gives the application option. It's very easily noticed. There you can reset not only the metadata for that file, but also the default for all files of that type. You no longer need to use file typer. Anyhow, in OS 9 this was a non-issue. If you wanted to open something with a different application, all you had to do was drag the file onto the icon for the app you wanted to launch it with. You could even do multiples.


    "I hate downloading things on a Mac because of this. Some idjit will have a file set to open in an application I don't have"


    This is deceitful. If you download a normal everyday file off the internet, odds are you are not going to get the meta information of that file, even if it was originally on the Mac. Almost every single application for downloading on the Mac uses the System level settings for mapping extensions to type/creator, which will ALWAYS be an app you have. The only times you will get meta info with the file is when you 1. download a stuffit archive and 2. use hotline. I personally can't remember the last time I downloaded a file as a stuffit archive, unless it was an installer, in which case it was irrelevant since it was self-contained. Now if you are heavily using Hotline for exchanging files, then odds are you are a pirate and deserve what you get.
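
    The extension mapping mentioned above can be pictured as a simple lookup table. This is only a sketch: the table entries are illustrative (the "EPSF"/"8BIM" pair is the one used in the AppleScript comment earlier in this thread), and `EXTENSION_MAP` and `type_creator_for` are hypothetical names, not a system API.

```python
import os

# Illustrative extension -> (file type, creator) table; each code is four characters.
EXTENSION_MAP = {
    ".txt": ("TEXT", "ttxt"),
    ".psd": ("8BPS", "8BIM"),
    ".eps": ("EPSF", "8BIM"),
}


def type_creator_for(filename, default=("????", "????")):
    """Look up type/creator by extension, ignoring case."""
    ext = os.path.splitext(filename)[1].lower()
    return EXTENSION_MAP.get(ext, default)
```

    Because the table lives on the receiving machine, the creator it assigns is always one the local user chose, regardless of what the sender had the file set to.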

    --

    Burn Hollywood Burn
    1. Re:Just not true by Chasing+Amy · · Score: 1

      > you can "Show Info" on a file (same thing as "Get Info" in OS 9) and there is a pull down
      > menu that gives the option application. It's very easily noticed.

      I know. I don't have OS X on any of my own machines yet, but I've played with it a lot. In general, I really love it. I think it's a wonderful evolution of the interface I came to love years ago when I was running System 7 on a scrappy little PowerPC at something like 66MHz. But my point is that in Windows all one needs to do is highlight the filename and change the three-letter extension. No dialog box and pull-down menu necessary. Faster and easier. IMHO, of course.

      > If you download a normal everyday file off the internet, odds are you are not going to get the meta
      > information of that file

      Maybe I'm weird, but when I used to download with Macs, it happened all the time. Particularly with graphics files. You are of course aware that type/creator is not in the resource fork for Mac files, and so if you download a graphic created with Mac Photoshop, it will usually want to open in Photoshop if you have it installed? Same for Graphic Converter files, and others. That's just my experience. I'm not being disingenuous at all. I also haven't downloaded anything with a Mac since OS 8.1, I now use a Windoze box for my vanilla downloading duties. The case may have changed, but from what I've read from others, doubtful.

      > Now if you are heavily using Hotline for exchanging files, then odds are you are a
      > pirate and deserve what you get.

      Umm, don't you find that an ironic statement from someone who has the .sig line "Burn Hollywood Burn"? ;-) So, pirating from Hollywood is the only kind of pirating that's good? hehe.

      --

      Chasing Amy
      (We all chase Amy...)
      "The more corrupt the state, the more numerous the laws"-Tacitus
    2. Re:Just not true by jafac · · Score: 2

      no, I never know whether my porn's gonna open in JpegView, QuickTime Picture Viewer, or Photoshop - and that's all with Jpeg files, with JpegView icons.

      --

      These are my friends, See how they glisten. See this one shine, how he smiles in the light.
  79. I want thumbnail metadata by steveha · · Score: 2

    One thing that bugs me: when a program (e.g. Nautilus) builds a thumbnail for an image file, the thumbnail isn't attached to the file, it is stashed somewhere (e.g. in a hidden directory called ".thumbnails" or something like that). This is a hack.

    It's worse when you have multiple programs that want thumbnails; there isn't a standard yet and you get multiple thumbnails.

    What I really want is some metadata attached to the image file, and the thumbnail in there. Then when you copy or move the file, the thumbnail goes along. And of course we need a standard so all the programs that want thumbnails will all do it the same way.

    steveha

    --
    lf(1): it's like ls(1) but sorts filenames by extension, tersely
    1. Re:I want thumbnail metadata by eram · · Score: 1

      This is exactly what the traditional Mac OS resource fork is used for in many graphics programs. The large image is stored in the data fork, which means that if I transfer the file to another operating system, it is still usable. In addition to that, there may be images suitable for preview or Finder icons in the resource fork.

      Another example of using the resource fork is that plain text editors store my text in the data fork. In addition to that, the resource fork is used to "remember" which part of the text was highlighted when I saved the file and possibly other preferences that I made for how to edit the file.

  80. Local filesystem metadata only the beginning by mlinksva · · Score: 1
    For files that are useful in many places (like those traded on gnutella and the like), metadata that is not only external to the file, but external to the filesystem/machine will be useful. What files do you trust, what files are "best", what is this file I have, really?

    I'm biased though, as I'm working on a global metadata repository.

  81. Re:I HATE the MacOS and its stupid metadata! HATE by opxe · · Score: 1

    It's not metadata which takes up that space, it's hidden resource files. The Mac metadata really is only 8 more bytes than other OSes. It is a different issue.

    But if there is one thing I intensely dislike about MacOS, it's the metadata. I know I'm practically alone in the Mac camp, but I hate metadata. I have always thought it was just a space-hogging pain in my ass. Now, the space issue is no longer a big concern since we have such big, cheap drives that a little filesystem metadata isn't such a burden on capacity. But back in the days of floppies I was pissed that I could fit so few files on a floppy when my friend with DOS could fit noticeably more. I was especially annoyed that even when I formatted a disk as a PC floppy, the Mac would still waste my space by creating and hiding from me files and folders on the disk to constitute the resource forks. I wanted every kilobyte, which counts when you're cramming a lot of small files onto a lot of small disks.

  82. "File Type" is actually two different things by crashdavis · · Score: 1

    What the author of the article is missing is that what he calls "File Type" is actually representing two (or more) things.

    1. The type of data represented by the file. This can be used to help an application know how to load the data, or more importantly, to help the USER know what is contained in that file.

    2. The application which the user prefers to view/edit this file with. This depends on the user, on the machine, on what apps are installed, etc.

    IMHO, he is too tied to the one-type-one-app view of the world. For example, take C++ programming. Most people use a directory for a project. In that directory, there are numerous files many of which have the same names. You end up with Foo.cpp, Foo.h, Foo.obj, Foo.exe, etc.

    In a world where all those files are called "Foo", you end up with a very confusing directory listing. To be usable, that directory listing has to display the type attributes anyway so the user knows what to point and click on.

    Another alternative would be to disallow files with the same names in the same directories, so what happens to those files? One option would be to rename the files "Foo Source" and "Foo Header" etc. but this is cumbersome and stupid. Another option would be to offload the helper files into other directories, but then you end up encoding type information in directory names instead of file extensions, which I'm sure we'd all agree is even worse.

    Another example of how this is messed up would be transferring files from another computer. If I transfer Foo.cpp from one Mac to another, the Creator information comes with the file. What if the file was created in emacs and I use vi? Or what if I use MS Visual Studio? What if I uninstall MSVS and install the Borland IDE? Do all my files break? Now I can't launch anything?

    What application to launch really doesn't have much to do with the creator field. And file extensions have other uses besides indicating a launch target.
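    A rough sketch of that split, in Python. Everything here is made up for illustration (the type strings, user names, and the FALLBACK_APP value are assumptions, not anything Mac OS actually does): the type travels with the file, while the choice of application lives in a per-user table on each machine.

```python
from typing import Dict

# Hypothetical data: the type describes the file and travels with it.
FILE_TYPES: Dict[str, str] = {
    "Foo.cpp": "text/x-c++src",
    "Foo.h": "text/x-c++hdr",
}

# Hypothetical per-user bindings: these stay on the local machine.
USER_BINDINGS: Dict[str, Dict[str, str]] = {
    "alice": {"text/x-c++src": "emacs", "text/x-c++hdr": "emacs"},
    "bob": {"text/x-c++src": "vi"},
}

FALLBACK_APP = "ask-the-user"


def app_for(user: str, filename: str) -> str:
    """Pick an application from the user's bindings, never from the file."""
    file_type = FILE_TYPES.get(filename)
    if file_type is None:
        return FALLBACK_APP
    return USER_BINDINGS.get(user, {}).get(file_type, FALLBACK_APP)
```

    With a layout like this, uninstalling an editor only touches the bindings table; the files themselves are unchanged and nothing "breaks".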

    --
    "The difference between theory and practice is small in theory and large in practice..."
    1. Re:"File Type" is actually two different things by John+Siracusa · · Score: 1
      What the author of the article is missing is that what he calls "File Type" is actually representing two (or more) things.

      1. The type of data represented by the file. This can be used to help an application know how to load the data, or more importantly, to help the USER know what is contained in that file.

      2. The application which the user prefers to view/edit this file with. This depends on the user, on the machine, on what apps are installed, etc.

      IMHO, he is too tied to the one-type-one-app view of the world.

      Er, which article are you reading? I specifically make the distinction between the existence of type metadata and the application binding policy--several times! :-)

  83. Metadata and its problems by Animats · · Score: 2
    Actually, this isn't about "metadata", it's about data attributes. Real metadata describes data, rather than just identifying it. An SQL schema is metadata. An XML DTD is metadata. A file type is an attribute.

    The main issues regarding file attributes are "how do you find it", and "what do you do with it when you've found it". UNIX users are used to thinking of these as being tied strongly to file names, but that's not fundamental. Consider Microsoft's Fast Find and Active Directory, for example. Or the way the MacOS handled applications; it didn't matter where they were, because there was a database (the "desktop") that automatically tracked them.

    Much of the complexity associated with UNIX programs involves finding their various parts. This usually involves some combination of path variables, configuration files, command line options, shell scripts, and fragile directory tree structures. Something better is needed.

    A separate issue is whether files should be flat streams of bytes or should have structure. The Mac had both; the "data fork" of a file was a byte stream, and the "resource fork" was a tree of records, rather like the Windows registry. This was a good idea, although it suffered from the fact that the machinery for updating the resource fork was prone to corrupting it, which discouraged its use for dynamic data storage. (Or, as Apple put it, "the Resource Manager is not a database.")

    Many of Apple's better ideas suffered from what Mac developers called the Mess Inside. One major effect of this was that things that required keeping complex data structures consistent didn't work too well. Application bugs could corrupt the desktop or resource forks, and the system-level machinery which processed those data structures didn't check them. ("It's more fun to be pirates" - Steve Jobs.) So data tended to turn to mush, which gave resource forks a bad name. Today, most programs store all the important stuff in flat files.

  84. you don't know your history by mj6798 · · Score: 1
    The problem with UNIX is an LCD (lowest common denominator) and designed by committee problem.

    UNIX was not the first OS, nor was it designed by committee. In fact, it was designed by a small research group in reaction to the excesses of systems like MULTICS (hence the name). Today, Windows NT follows in the MULTICS footsteps, with excessive APIs, redundant functionality, and a special-purpose API for any conceivable application. The lessons of UNIX are as relevant today as they were 30 years ago.

    Compared to Win2K, Linux's technical advantages are pretty minor.

    The advantage of UNIX (and to a more limited degree, Linux) is absence of features. Microsoft cannot catch up with that because they are moving in the wrong direction.

    Microsoft has moved on, it's important for the UNIX community to do so as well. ACLs (implemented on NT) are FAR more flexible than users/groups.

    How naive can you be? Do you think people at Bell Labs didn't consider ACL-like mechanisms? Don't you think they would have added them by now to Research Version 10 or Plan 9 if they thought this was the right thing to do? The creators of UNIX have never been constrained by committees or backwards compatibility: they have always done what they thought was right, and they changed things when they believed they had done something wrong. ACLs, so far, haven't cut it for them. You may disagree with their technical judgement, but don't attribute that disagreement to some nebulous notion of being outdated or old-fashioned.

    HOWEVER, the system of making things conveniently obvious for the CLI results in engineering decisions that give the OS less flexibility. GUIs can provide TREMENDOUS amounts of information BECAUSE the user decides when to get that information.

    As the Windows NT file properties dialog shows, GUIs don't help one bit with this problem. Yes, you can present lots of information in a GUI, but users can't process it any better than they could with any other method of displaying it. In practice, if you actually allow users to maintain their own ACLs, you end up with a complete mess of permissions on your hands.

    Let's get back to the overall point. Of course, there is a need in the world for systems like MULTICS and Windows NT. There has always been, and there always will be. That's not because they are any better designed or any more modern, but because such systems satisfy the preferences and tastes of the masses of programmers. But that doesn't make such systems well-designed. As far as I'm concerned, systems like UNIX and Plan 9 are the systems for the thinking programmer, and MULTICS and Windows NT are the "lowest common denominator", catering to the masses who don't know any better. Which is also why I observe with a lot of concern the attempts to turn Linux into Windows and add all that Windows junk to Linux. As far as I'm concerned, that's not progress.

  85. Creator and file types by mfnickster · · Score: 1

    As others have pointed out, BeOS did this right by making this part of the interface instead of like a Mac where the type/creator info is hidden from the user and not editable without downloading additional software.

    Just to nitpick-- you can change the file/creator types using Applescript, which comes installed on every Mac. That doesn't mean, of course, that your average end-user is going to do this.

    The way it's supposed to work, you can drag your icon onto the application you want to open it with, and then save it from within that app. Presto, the icon changes, and you never have to see a 4-letter code. My problem with it is that there's no easy way to de-associate a file from all apps.

    - MFN

    --
    "Slow down, Cowboy! It has been 3 years, 7 months and 26 days since you last successfully posted a comment."
  86. Re:I HATE the MacOS and its stupid metadata! HATE by SeanAhern · · Score: 1

    Remember what the author of the article said: Just because you are offended by a particular behavior of an OS doesn't mean that it's the fault of the metadata itself.

    Having a file type stored in the filesystem itself, rather than in the filename, only means that the interface to present and change the information has to adapt.

    For instance, on the Mac, if I want to open a document with a different application than the one it's bound to, I right click on the file and choose a different app under my "FinderPop" popup menu. It's a tiny piece of shareware that lets me do this. Granted, I'd like it in the OS itself, but it's seamless for me.

    Changing the type, I have to admit, really is a pain in the ass on the Macintosh. I hate having to pull up a special program and type in some special codes to do this. What a pain.

    But many things can be gotten around, assuming that the file system will support it.

  87. Re:Think again. Or perhaps just think. by HalfFlat · · Score: 1

    Soon we'll have to take this to e-mail :)

    One could define metadata as any data associated with some other data, but isn't that a bit broad? It really does encompass everything from word count to plot description. Surely a more practical definition is data which is not just associated with, but additional to the original data in question.

    That's a debatable point. In the broadest sense, then yes, size is metadata, but then so is the byte-frequency histogram of the data, or the number of odd-length words found in the data, and so on. The most broad definition of metadata is too broad to be useful. Perhaps there is a compromise position, but I can't see anything wrong in working with that outlined above.

    With the issue of size: sure, it has to be stored somewhere (or equivalently, with an EOF marker.) So do the bytes of the file. So do the error-correction bits stored on the hard disk (for example.) My point is that the singly-forked unadorned file - in the abstract - is a finite ordered set of bounded numbers (typically bytes). How that set is represented physically is an implementation concern, and will have to include an encoding method for the bytes themselves, probably some error correction at the lowest level, and of course the size, too. Finite sets by their very nature are delimited somehow - if they weren't they wouldn't be finite!

    The file-type of a file might be stored separately (as metadata); it might be derivable consistently from the file contents (as metadata). It's not an intrinsic property of the ordered set of bytes the file represents - it's entirely up to interpretation within the computing context. This is unlike size. Size doesn't care what the content is. Size is a necessary and constant property of any finite set of data.

    To take the example of the AFPDS data, the fact that you can tell it is an AFPDS file relies on information external to the simplest file abstraction. It requires an external context, that of the AFPDS file format, for the file type to be derived. There's nothing intrinsic to the ordered set of numbers that makes it so; it could well be weather-data to some other software.

    I also need to address this statement:

    SO, file size is most definitely metadata. There's no way of knowing what is in a set of data unless you have additional information which is not the data itself.
    The size of the data is part of the data itself - well, more precisely - once you have described a set of data, you have also described its size. To be repetitive, storing the size, or an EOF marker, or somesuch is simply part of the representation of that set. One could represent a file by a single huge integer, and then store that integer on disk, for example, impractical though it be. (Of course, the actual representation of the integer on disk would probably in the interests of space efficiency be delimited in some way.)
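    To make the "single huge integer" idea concrete, here is a minimal sketch in Python. One assumption of mine: a 0x01 sentinel byte is put in front of the data so that leading zero bytes are not lost; without some such convention the mapping from files to integers would not be one-to-one. The function names are hypothetical.

```python
def bytes_to_int(data: bytes) -> int:
    """Encode a whole file as one integer (sentinel keeps leading zeros)."""
    return int.from_bytes(b"\x01" + data, "big")


def int_to_bytes(n: int) -> bytes:
    """Recover the original bytes from the integer."""
    length = (n.bit_length() + 7) // 8
    return n.to_bytes(length, "big")[1:]  # drop the sentinel byte


def size_of(n: int) -> int:
    """The size falls out of the integer itself; nothing extra is stored."""
    return (n.bit_length() - 1) // 8
```

    Once the integer is given, the size is determined by it, which is the sense in which size is a property of the data rather than something added alongside it.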

    PS: I had never before encountered the acronym AFPDS. I looked it up in an acronym index, but it wasn't very enlightening :).

    PPS: One can extend the abstract file notion to include streams, which are ordered possibly infinite sets of bytes (or whatever). Take for example, /dev/zero under Unix. In such a context file size doesn't make a lot of sense, but this is only tangentially related to the argument.

  88. Re:Before another non-insightful mac zealot posts. by mfnickster · · Score: 1

    That's right. You need third-party software to change the file type on a Mac, and even then it's not easy.

    It's as easy as dragging the file icon to an app that can read it, then choosing "Save." Don't have an application that can read the file? PCs aren't immune to that problem either.

    If the file type is incorrect, that's when you need to mess with the codes. And you can use Applescript to do it, no need for third-party tools.

    While Windows maintains a correlation between filenames and descriptions (.doc = Microsoft Word Document, for example) there is no such listing in Mac OS. True power users are just supposed to know that MSWD is a Word document.

    So is that ".doc" file a Word 6 document? Word 3.0? Word 97? They all fight over the same extension, but the file formats are different. This example is far from being a rare or unusual case.

    - MFN

    --
    "Slow down, Cowboy! It has been 3 years, 7 months and 26 days since you last successfully posted a comment."
  89. I'm running low on asshole. by hypermanng · · Score: 1

    The definition of metadata isn't "useful information about other data" or "data about other data we've decided to store" or whatever. It's just "data about other data". Yeah, frequency histograms are usually pretty useless as metadata, but they are still metadata. I just wouldn't have the system track it.

    Another definition for metadata that is insufficiently conclusive is "data outside the file describing data inside." What you decide to call a "file" is really a question of convention. They are not necessarily contiguous on disks, for example, and are generally the logical representation the system offers to an application. If the system knows which bytes to offer, it's already read the file size somehow. In mainframe environments they don't even talk about "files" most of the time, generally preferring different ways of representing datasets. So, as you were mentioning the different ways of "representing" sets, you were showing that you actually agree with me but you keep stumbling over the PC baggage of the "file" term. Perhaps it was a bad choice to include it in the conceptual part of the article, since it tends to lock people into a way of thinking.

    As for the AFPDS thing - I really don't remember what the hell I was trying to show there. Doing too much crack, I guess. Maybe I'll figure it out later. In the meantime, AFPDS is a flavor of mainframe printstream. I spend a good portion of my day slogging through data and metadata at the byte level, troubleshooting printstream transforms. Mainframe print environments have impossibly convoluted metadata structures.

    --
    I am the one true god. However, as an atheist, I don't believe in myself. I guess I have a self-esteem problem.
  90. ONGRTLNS? by Webz · · Score: 1

    What does C:\ONGRTLNS.W95 mean?

    1. Re:ONGRTLNS? by Anonymous Coward · · Score: 0

      Congratulations Windows 95

  91. Automatic translator by Aapje · · Score: 1

    On MacOS 9 you usually get a nice window which lets you pick an app to open the file with.

    Files that come from the Internet or a PC automatically get the proper type and creator (most of the time).

    It's also quite easy to change the type/creator for a power user. You should not act like all Mac-users are dumb.

    --

    The Drowned and the Saved - Primo Levi
  92. Why not databases? by nograz · · Score: 1

    We've been struggling with filesystems for a long time now. But IMHO filesystems themselves are some sort of "hack". In the end it's all about data, so why not use applications which are "experts" in handling data: databases! This of course sounds like the wording of an ORACLE representative: "everything is a database". But hey, at least with files they're right. Why we still don't put our data in databases is a mystery to me ...

  93. What I hate about UNIX filesystems by anarkhos · · Score: 1

    No creation dates!

    only atime, mtime, and ctime.

    And we all know, if it wasn't designed 30 years ago, UNIX won't adopt it.
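    For reference, this is roughly what you can get out of stat(2) from Python. On Unix, st_ctime is the time of the last inode change, not a creation time. Some platforms (Mac OS X and FreeBSD, for example) expose an extra st_birthtime field, so the sketch asks for it and falls back to None where it does not exist; the function name is my own.

```python
import os
import tempfile


def unix_timestamps(path):
    """Return the timestamps stat() offers for a path."""
    st = os.stat(path)
    return {
        "atime": st.st_atime,  # last access
        "mtime": st.st_mtime,  # last content modification
        "ctime": st.st_ctime,  # last inode change on Unix, not creation
        # Only present on some platforms; None elsewhere.
        "birthtime": getattr(st, "st_birthtime", None),
    }


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        for name, value in unix_timestamps(tmp.name).items():
            print(name, value)
```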

    --
    >80 column hard wrapped e-mail is not a sign of intelligent
    >life
  94. Suggestion: Metadata on traditional file systems by tequesta · · Score: 1
    I think the reason for Apple to move to file name extensions is that MacOS X is basically Unix. All those BSD tools can't handle any other metadata than filename, permissions and access/mod dates. Still, it should be possible to access or modify all metadata information from scripts or the command line. It would have been possible to write accessor tools for that, but it looks like Apple chose the easy way.

    Which brings me to all the other operating systems of choice, which may partly have file system support for metadata, but certainly not application or tool support. For that to arrive, file system support for metadata must be pervasive --- otherwise, applications would have to decide

    • What file system is this file going to be saved on?
    • Does the file system have metadata support?
    • If yes: Which API is needed to store metadata?
    • If not: Can the file be stored without metadata?
    ...etc ad nauseam. Now, if we do agree that metadata is essentially a good thing, we ought to be able to save it on any proper file system (as a consensus of minimal functionality, I'd suggest long (128-char, at least) filenames without special characters except pathname separator), and it ought to be accessible by regular Unix tools for portability and backwards compatibility.

    I'd suggest the way that MacOS does it with its applications -- simply package everything into a directory. That directory could contain a file with a fixed name, describing what type of file this is (Apple uses "Info.plist" in XML), and any number of data streams -- one, perhaps, for the binary file data itself, a couple for resources, etc. etc. The resulting directory (I'll call it a "package") can be manipulated by legacy tools, and there could be library support for the format to make it easy for new tools to make full use of it.

    The only problem I see with this is that this directory would have to be encoded for transmission over the net; but that's the problem that John wrote about, and could be solved by simply making a tar file of the directory. That even has the added advantage that there's tool support for tar files on just about any platform, which is more than can be said about MacBinary.
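    Here is a minimal sketch of what such a package could look like, using Python's standard plistlib and tarfile modules. The keys "Type" and "Streams" and the stream name "data" are placeholders I made up; they are not Apple's Info.plist schema.

```python
import plistlib
import tarfile
import tempfile
from pathlib import Path


def make_package(root: Path, name: str, file_type: str, data: bytes) -> Path:
    """Create a directory holding an XML Info.plist plus one data stream."""
    pkg = root / name
    pkg.mkdir()
    with open(pkg / "Info.plist", "wb") as fh:
        # Hypothetical keys, for illustration only.
        plistlib.dump({"Type": file_type, "Streams": ["data"]}, fh)
    (pkg / "data").write_bytes(data)
    return pkg


def read_type(pkg: Path) -> str:
    """Read the type back out of the package's Info.plist."""
    with open(pkg / "Info.plist", "rb") as fh:
        return plistlib.load(fh)["Type"]


def pack(pkg: Path) -> Path:
    """Bundle the package directory into a tar file for transmission."""
    archive = pkg.parent / (pkg.name + ".tar")
    with tarfile.open(str(archive), "w") as tar:
        tar.add(str(pkg), arcname=pkg.name)
    return archive
```

    Plain ls, cp -r and tar still work on the result, which is the point of the proposal; only tools that care about the type need to read Info.plist.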

    Any comments?

  95. I hold up this bad idea for ridicule by Slur · · Score: 1

    In other words DumbName.jpg and DumbName.txt should not be allowed in the same folder. Then hide all the file extensions and the users would be none the wiser.

    Yes I would. Especially in the terminal, where I want things to behave like they should bloody behave.

    In other words DumbName.jpg and DumbName.txt should not be allowed in the same folder. Then hide all the file extensions and the users would be none the wiser.

    Yes I would, didn't you hear me the first time? Those are two different files with two different names, and I don't care what you hide!

    In other words DumbName.jpg and DumbName.txt should not be allowed in the same folder. Then hide all the file extensions and the users would be none the wiser.

    Fine, be that way, but I'm never going to use a nutty OS that acts like that. And don't ask me to use an OS that has a universal handler for files that end in .jpg either. If it doesn't have an application specified then give me a list to choose from, don't go assigning things to it for me.

    --
    -- thinkyhead software and media
  96. Object-oriented filesystems by nwetters · · Score: 2, Interesting

    The current problem is that filesystems don't make it easy to store properties of a file. The HFS made a brave attempt by dividing files into content and properties (using data and resource forks), but it still didn't objectify the filesystem. For example, creating children of a particular file necessitated converting the file into a folder and then wondering what the hell you were going to do with the folder properties - were they going to be placed within the folder or within the parent folder.

    A solution would be an object-oriented filesystem, that allows every file to have children without nasty conversions, and implements a simple store for properties (a Berkeley DB file would seem a natural solution).
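    As a toy model of what I mean (in memory only; a plain dict stands in for the Berkeley DB property store, and the class and method names are invented for the example), every node carries content, properties, and children at the same time, so nothing has to be "converted" into a folder:

```python
from typing import Dict, Optional


class Node:
    """A file that holds content, properties, and children all at once."""

    def __init__(self, name: str, content: bytes = b"") -> None:
        self.name = name
        self.content = content
        self.props: Dict[str, str] = {}       # stand-in for a property store
        self.children: Dict[str, "Node"] = {}

    def add_child(self, child: "Node") -> "Node":
        self.children[child.name] = child
        return child

    def lookup(self, path: str) -> Optional["Node"]:
        """Resolve a slash-separated path relative to this node."""
        node: Optional["Node"] = self
        for part in filter(None, path.split("/")):
            node = node.children.get(part)
            if node is None:
                return None
        return node
```

    A document node can then keep its own bytes and properties while also holding, say, a "thumbnail" child.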

  97. Re:I HATE the MacOS and its stupid metadata! HATE by Anonymous Coward · · Score: 0

    > I agree kind of with this. But Mac OS X fixes this. In the file info there is a place for application. Where you can choose a application which opens only that file or opens all files of that type.

    Did you actually try it? I mean, half of the time it is greyed out, and half of the remaining time opening the 'Choose application' dialog freezes the Finder. When it works, it generally forgets to save the option.

    Face the truth:
    1/ Mac OS X Finder is a piece of shit.

    2/ The ars technica writer is overly verbose

    3/ He is a mac weenie, and loves everything apple did.

    4/ File types were wrong from the start. Is HTML a text file or an HTML file ?

    5/ File types as implemented by the original mac are even worse (An editor that can open a TEXT file cannot open an HTML file)

    6/ Don't get me started on file creator.

    7/ Any idea that "a utility exists that can alleviate the problem" is stupid. It works out of the box, or it doesn't.

    8/ File extensions are a bad idea. But those are necessary. NeXT, for instance, implemented those correctly.

    9/ Resource forks are even worse than file types

    10/ Standards are good.

    11/ Converting from names to extensions when 'externally storing files' is not such a good idea. It is the kind of ideas that make parts of Mac OS X unworkable on UFS.

    12/ Making comparison between file type and file size is mindless advocacy.

    13/ A gzipped text file is not a GZIPped file. It is a GZIPped TEXT file.

    14/ Interoperability imposes the use of file extensions. So file extensions should be used.

    15/ Mac OS type/creator were painful in 1986 (when I started developing for it). They are much more painful today, when my personal mac is networked with several FreeBSD boxes and a windows one. And when my Mac can boot 4 operating systems.

    16/ file(1) is good. The OS should make use of something similar (ie: each app-maker should register its extensions and provide a machine-readable description of how its file type can be recovered from content). This database should be maintained up to date, and used by the OS when file type is unknown.
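    A bare-bones sketch of the idea in Python: a table of leading byte signatures, checked in order. The signatures listed are the well-known ones for these formats; the table layout and the sniff() name are mine, and the real file(1) database is far richer than a prefix match.

```python
# (signature bytes at offset 0, type) pairs; each app-maker would
# register entries like these for its own formats.
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x1f\x8b", "application/gzip"),
    (b"%PDF-", "application/pdf"),
]

UNKNOWN = "application/octet-stream"


def sniff(data: bytes) -> str:
    """Guess a type from the first bytes of the content."""
    for signature, file_type in MAGIC:
        if data.startswith(signature):
            return file_type
    return UNKNOWN
```

    Note that this answers point 13 only halfway: a gzipped text file sniffs as gzip, and you have to decompress it to learn what is inside.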

    Cheers,

    --fred

  98. LAN's need file types too by Anonymous Coward · · Score: 0

    When I first read through this article I thought.. yeah that makes sense.. but sharing files over a LAN is much more important for my business than over the Internet.

    At least once a week we run into a problem where a PC programmer has to go to a Mac Artist to find out what kind of file was just sent.... this is useless communication.

    In addition it's very useful to look at a folder full of files and at a glance know which are .psd (source files) and which are .gif (final output).

    1. Re:LAN's need file types too by Ashok · · Score: 1

      Why not share the files over something that is MIME-aware? (and use systems which are too)

      That it's your LAN doesn't stop you using sensible protocols used in the wider, wilder Internet.

      --
      ash
      ... You can call it a wizard once it can do bloody magic
  99. nice, but I think you missed the point by twitter · · Score: 2
    name.c and name.h are two text files that may have been created by vi, emacs, gnotepad, KDE's advanced editor, vim, la la la, the list is very long and includes automagic code generators with yet undetermined names. Why bother trying to store this info, when it's so obvious from the name extension? Oh yeah, that's the way we've always done it, so I must be stupid.

    Sometimes I want vim, sometimes I want gnotepad, sometimes I want something else. I never want Word, and I don't want some stupid meta data setter telling me I do. No, thank you, DickBreath.

    --

    Friends don't help friends install M$ junk.

    1. Re:nice, but I think you missed the point by DickBreath · · Score: 2

      I never want Word

      Neither do I. But it was the first example to come to mind. It's beside the point.

      Normal apps, i.e. emacs, vi, etc. aren't going to set either the type or creator. So everything still works the same. You still type the same commands, just as always. If the type/creator is not set, extensions could always be used.

      Why bother trying to store this info, when it's so obvious from the name extension? Oh yeah, that's the way we've always done it, so I must be stupid.

      So why should I use this "automobile" thing when the horse and buggy is highly developed?

      And I suppose that Linux users will just always have to put up with a second-rate end user experience, because of what you want.

      It's obvious to anyone that if man were meant to fly, he'd have wings. And it's obvious that the earth is flat.

      --

      I'll see your senator, and I'll raise you two judges.
  100. and I almost forgot by twitter · · Score: 2

    Like I said, I never want Word. It would really upset me to work on a project with some dickbreath who used Word, if his modifications would make name.c open up that way thanks to the brilliant "created with" metadata.

    --

    Friends don't help friends install M$ junk.

  101. the real issue by curious.corn · · Score: 1

    Distinguishing filetype based on extension is what many of us are used to. So a .h file is a header and a .c is code. How can one tell them apart if there's no extension? Well... a modified ls would do the job. Even changing from one type to the other wouldn't be a problem: 'chtype file1 HTML'. If you get used to the idea that extensions don't belong to the name, you could easily 'vi index.html' to create a file 'index' (just the name) with the associated filetype HTML in some other database.
    'ls *.c' becomes 'ls -T x-csrc'
    Actually it all boils down to ditching the idea of a FS as nodes/leaves. Way cool would be that a make install on an app would mean simply changing its type from plain object code to systemwide executable without even moving it from its physical location on the hd. It's a tremendous change... I can't even think of all the consequences. One, though, would be phenomenal:
    how do I distinguish filetypes if they don't have extensions? I open the file, read the magic ID in its header, compare to /etc/magic (or /usr/share/magic), and close. Very expensive, but just 'SELECT type FROM utonto WHERE name=p0rn' is quite a bit faster, right? Installing a new kernel would just require recording its inode in the kernel-type table. LILO just parses it and presents the list of available ones. Libraries? Just the same, and it actually isn't conceptually different from moving the bits from one place to the other in the filesystem.
    Creating /dev/*? Oh well, just a matter of adding entries to the db. Publishing a page on apache simply means adding it to the apache-table (and this is what we conceptually do when we copy the file in /var/www/html). Folders and files are just an implementation of database storage that was familiar to people used to filing cabinets. It's just A way to do it and perhaps there are more efficient ones. Rumors say M$ is going that way, BeOS did it long ago, and for UNIX the switch wouldn't be traumatic (if you had code to map user files to a fictitious ~/ or the systemwide-lib table to /lib).

    If it ain't broke don't fix it: true. The filing cabinet is a metaphorical interpretation of data storage that produced the current filesystem architecture; it works but we could grow out of it and move to an object-relational DB system that could provide enhancements to data sharing (NFS anyone?) and access control.
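    A quick mock-up of the type catalog with Python's sqlite3 module, just to show the shape of it. The table name "files" and the helper names are invented; chtype and 'ls -T' are the hypothetical commands from above, mapped onto an UPDATE and a SELECT.

```python
import sqlite3


def open_catalog() -> sqlite3.Connection:
    """An in-memory stand-in for the filesystem's type database."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE files (name TEXT PRIMARY KEY, type TEXT NOT NULL)")
    return db


def create(db: sqlite3.Connection, name: str, file_type: str) -> None:
    db.execute("INSERT INTO files (name, type) VALUES (?, ?)", (name, file_type))


def chtype(db: sqlite3.Connection, name: str, file_type: str) -> None:
    """'chtype file1 HTML': the name stays, only the type changes."""
    db.execute("UPDATE files SET type = ? WHERE name = ?", (file_type, name))


def ls_by_type(db: sqlite3.Connection, file_type: str) -> list:
    """'ls -T x-csrc': list names by type instead of globbing extensions."""
    rows = db.execute(
        "SELECT name FROM files WHERE type = ? ORDER BY name", (file_type,)
    )
    return [name for (name,) in rows]
```

    The lookup is an indexed query on the name rather than opening each file and comparing magic bytes, which is where the speed argument comes from.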

    --
    Mi domando chi à il mandante di tutte le cazzate che faccio - Altan
  102. Re:Ummm... MacOS X Does that, dude. by Frobozz0 · · Score: 1

    Sorry to crash your party, but the MacOS does exactly what you are talking about... it allows you to choose which application you open a specific file with, in addition to setting the default for that file type. Very handy when you get a file from a web site without a file extension, but you KNOW it's a .zip archive...

    --
    "Politicians find new names for institutions which under old names have become odious to the people."
  103. Re:Linux isn't following Windows, it's following U by rhavyn · · Score: 2

    Linux does have file change notifications, it's just not integrated with the mainline kernel yet. IIRC, it's SGI's Irix implementation which they opened for Linux.

  104. Creators vs. Editors by nquartz · · Score: 1

    One of the reasons I advocate Macs for my print centers is that it does track both the file type and the creator code. When your primary job is handling incoming files for output, it's essential information to know if that EPS was created by Illustrator, Freehand, QuarkXPress or a WindowsNT Postscript print driver. It tells the output technician what program to open and what problems to check for.

    But don't treat a "creator" as an "opens by default" tag - that's not what it's there for. For my part, I wish the creator code had a paper trail. If I knew that this EPS file was created by QuarkXPress, pulled into Freehand, exported to illustrator and then rasterized in Photoshop, I'd know a heck of a lot more about why it's not printing the way the customer expects.
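    Something like the following is what I have in mind for the paper trail. It is purely a wish-list sketch in Python: classic Mac OS stores a single creator code, so the history list, the class, and its method names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FileMeta:
    """File type plus an ordered history of the apps that saved the file."""

    file_type: str
    creators: List[str] = field(default_factory=list)

    def record_save(self, app: str) -> None:
        # Consecutive saves from the same app add nothing to the trail.
        if not self.creators or self.creators[-1] != app:
            self.creators.append(app)

    @property
    def original_creator(self) -> Optional[str]:
        return self.creators[0] if self.creators else None

    @property
    def last_saved_by(self) -> Optional[str]:
        return self.creators[-1] if self.creators else None
```

    None of this says which app should open the file; it only records where the file has been.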

    I can see that in other environments this information may be superfluous, and the default opening application might rather be set to something else. The best point the author of this article made was that how a file opens should be easily configurable by the user - either open in the native application, or open in something else. Let us decide. But the metadata that supports that action is immutable, and should be saved with other immutable data, not with the stuff that can change irrespective of the data.

    --

    --Any sufficiently reliable magic is indistinguishable from technology.

  105. I am aware of my history, I'm not talking about it by alexhmit01 · · Score: 2

    I'm talking about a modern UNIX, meeting the recent specifications (I believe UNIX 98 is the most recent).

    The ORIGINAL UNIX was designed well for its time. I would suggest that there is little in common (code-wise) between the original UNIX and modern UNIXes. The BSDs and Linux share NO code with the original. While the commercial UNIXes may share some code, overall they have been rewritten.

    Now, each vendor makes their own UNIX. The "Linux" distributors make their own system, though the kernel and the C library appear to be standardized.

    The point that I am getting at: a modern UNIX is a well-engineered machine. However, when you write portable code, you write either to the specification for all UNIXes (the designed-by-committee standard that hasn't been updated in 3 years, and most of the standard is MUCH older), or you write to a particular UNIX. Yes, you can write a fall-back to the standard and optimize for your platform, but you really haven't helped then, have you? Your platform is optimized, the rest just run.

    Take a current issue in the BSD community. The BSD rc system is beautifully simple, and less of a mess to manage than an SVR4 system. However, it does have some problems in that there is no easy way to bring services down or up, etc. We wrote our own service script (inspired by Red Hat's, but custom to us) whose configuration we edit as we add services. This gave us some control of the system.

    NetBSD wrote a new rc system that uses shared scripts and small configuration files for each daemon. It seems to be an intelligent system, without the mess I have seen in Linux systems.

    FreeBSD appears interested, and will likely adopt it.

    For reasons that may be technical or political (it ISN'T clear which), OpenBSD will not. This means that people writing to the BSDs can choose to support two systems (NetBSD/FreeBSD AND OpenBSD), or target one platform (either NetBSD/FreeBSD OR OpenBSD).

    Now in this case, the rc system, it is less of a concern. Targeting both is trivial. However, as most people use FreeBSD or NetBSD, those of us in the OpenBSD community are part of a smaller niche.

    However, we see the potential for a problem. Targeting the BSDs requires targeting BSD 4.4. Any extensions made by one of the BSDs may or may not be ported to the rest. Playing nicely with Darwin/MacOSX is another problem, but Apple seems to be willing to follow FreeBSD's lead on the BSD side, focusing their creative/decision process on the Aqua/Quartz side.

    To get the BSDs to move forward, you need a committee that shares improvements (unlikely for political reasons) and gets them standardized when everyone moves forward.

    IDEALLY, you can target your platform of choice, but the features you want will be in the other platforms within 1-2 years. That way, you can even have a fallback for lagging systems. However, there should be a process by which useful extensions become part of the base, so the base is slowly evolving.

    I understand your preference for a less-is-more approach, which I think makes sense for the system/kernel. That design philosophy encourages better systems. However, when there is a way to do something, there should be an attempt to develop the best way, then everyone should use it instead of reinventing the wheel.

    The core UNIX system has two standards, POSIX and X11. Both of those systems were designed separately, and you can target them and it will work everywhere.

    If you want a GUI app, targeting X11 is a pain. Targeting Motif is slightly more pleasant. Targeting Qt is easier. However, you can't target ANYTHING but raw X11 until you know that the standard requires something.

    That is what I'm getting at. Motif was adopted as the UNIX standard long ago, but its non-free status has kept it out of Free systems.

    Java promised to be that universal platform that we could all target. In an ideal world, there would be no more native code for applications; all new applications would target the JVM or something similar. Java offered what MFC did for Windows programmers: a real environment to build applications in. Unfortunately politics and other limitations got in the way, and we still don't have a universal platform. Targeting all UNIXes is okay with an LCD approach; targeting all UNIXes and WinNT requires a different approach. But nothing out there presents a good way to move the common base forward so I can build an app that is optimized for everything, or at least will be when everyone catches up.

    Alex

  106. Re:I HATE the MacOS and its stupid metadata! HATE by Anonymous Coward · · Score: 0

    And that's easier is it?

    Go back to your Jobs-worshipping you prick.

  107. Ever had to handle ACLs (Access Control Lists)? by osolemirnix · · Score: 1

    ACLs (Access Control Lists) are a very nifty feature in a network environment. They provide a much better control for the end user.

    But nobody ever uses them. Why? Because there is no real standard (just a draft); every Unix (Solaris, HP-UX, IRIX, AIX), and WinNT on top of that, uses its own scheme, and they're all incompatible.

    So part of the problem is standardization. The Mac file TYPE has the same problem. While Apple theoretically is the single authority on this, there are many non-registered file type fields in use and it keeps getting worse.

    Guess what happens if two applications use the same file or creator type, but with different content? - Ay problema!

    --

    Idempotent operation: Like MS software, whether you run it once or often, that doesn't make it any better.
  108. Still not true by Auckerman · · Score: 1
    "But my point is that in Windows all one needs to do is highlight the filename and change the three-letter extension. No dialog box and pull-down menu necessary. Faster and easier. IMHO, of course."


    Current shipping version of Windows (ME) does not allow that. You have to right click on the file and get the properties dialog, because....the extensions are hidden. Second, the current version of Windows has a VERY arcane dialog for changing default apps for a given extension which, compared to the elegance of OS X, is next to useless.


    "You are of course aware that type/creator is not in the resource fork for Mac files, and so if you download a graphic created with Mac Photoshop, it will usually want to open in Photoshop if you have it installed?"


    Just not true. Netscape, IE, and every FTP app I can think of reset the type and creator on download, if it is there at all. They use the built-in database in MacOS for determining the type/creator, which will ALWAYS be set to an app you have. As I said, the ONLY times I know this will happen are when you 1. download a StuffIt archive and 2. download from a Hotline server that is Mac based. As I also said, I can't recall the last time I downloaded a file as a StuffIt archive, and Hotline is for pirates...

    --

    Burn Hollywood Burn
  109. Which Metadata is arbitrary/unnecessary? [diatribe] by aphor · · Score: 1

    A filesystem is a service. It (should) provide storage and retrieval of any string of bits that will fit. The only essential (key) metadata is the filename/location (address) information. Dates and permissions/ownership are metadata that the filesystem provides for itself, and related file-management software.

    File type metadata is redundant. For proof, I offer the example of steganography software. The essential part of steganography is the ability to look at data which appears to be unusable and yet somehow extract data which conforms to our expectation. My point is that file type metadata can be computed from the data itself, and therefore any stored designation is arbitrary and unnecessary.

    You might think "why not talk about magic numbers?" Let's do: steganography can be computationally intensive. If you don't need to obscure the data's purpose, why not embed a label in the data itself? Oh! We do! Most (well designed) file format specifications include such a designation in the first few bytes of their internal structure.
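
    To make the magic number point concrete, here is a minimal Python sketch that guesses a type from the first few bytes. The signatures listed are the well-known ones for these formats; the sniff_type function and the SIGNATURES table are names made up for this example, not part of file(1) or any OS.

```python
from typing import Optional

# A few well-known leading signatures ("magic numbers") and the MIME types they indicate.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),
]


def sniff_type(data: bytes) -> Optional[str]:
    """Return a MIME type guessed from the leading bytes, or None if unrecognised."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    return None


# A PNG header is recognised; plain text carries no signature at all.
print(sniff_type(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
print(sniff_type(b"just some plain text"))
```

    Formats without a signature, plain text being the obvious one, fall through to None, so something else (an extension, stored metadata, or the user) still has to fill that gap.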

    Theoretically, a filesystem can do its job without knowing anything about the file contents. Oh! Some filesystems do! Utilities which manipulate the data served up by the filesystem cannot read and write files without prior knowledge of the file's contents. Which (application or filesystem) should be authoritative on file types? That's right: applications!

    What happens when you ask the filesystem to choose which application to handle any given file? Then the filesystem needs prior knowledge of *all* file types for every file. It needs some kind of file type metadata on every file type. Windows handles this with the registry each time an application is installed: it tells the filesystem what kind of files it uses and how to recognise them (by yucky filename extensions). But, wait! That's Windows Explorer I'm talking about: a utility/application NOT the FAT32 or NTFS filesystem. Unix has a similar function called "magic". It is also an application/utility and not part of the filesystem. Even NeXT bundles (meta-files?) in MacOS X (and GnuStep) place the onus for file handling on the applications.

    The file browser is an application, and it has needs which are a subset of filesystem needs, a subset of each application's needs, and a set of its own peculiar features' dependencies. The Mac Finder, Windows Explorer, and every X-Windows filesystem browser (browser for short) has strategies to deal with each of these sets of needs. Let the browser handle the file typing! What is the problem?

    Oh, that "creator" metadata that assumes you have several applications that handle a given file type, but you prefer to open a file with the application that created it, creates a knowledge gap between the application and the browser. Doesn't that break the assumption that a file type is a standard and that software should not assume one application is better than another?

    Oh, I forgot: Mac users have been promised they won't have to make any unnecessary choices (like which JPEG application to use for each file). This is an optional feature of a "Mac flavored" filesystem browser and should be implemented within the browser itself. How should (say in MacOS X) the filesystem browser get the creator metadata from the application? How about using RCS style metadata? Isn't creator information really historical data? Who should handle logging a file's history? Traditionally, that would be the application. Maybe the API should provide transparent historical metadata maintenance and hooks for the filesystem browser to access this history?

    First I have a system, and it works, but leaves me with too many choices. Next I want the system to have intuition so I can avoid making some choices. Now my once clear system is as complex and muddy as my own thinking as I digress and eventually undermine my working system's design.

    You see, eventually the historical metadata can become a neural network and anticipate data handling in more sophisticated ways without even the programmer having to make decisions about data handling...

    --
    --- Nothing clever here: move along now...
  110. Re:I HATE the MacOS and its stupid metadata! HATE by Anonymous Coward · · Score: 0

    >4/ File type were wrong since the start. Is HTML
    >a text file or a HTML file ?

    It's a text file, and that's what's so great about the MacOS. An example: I write some html-file using bbedit. It's got the correct html-ending; if I double-click it in the finder, it gets opened in bbedit, but I can still also edit it with any other Mac-program that knows text-files, even if they don't know about html, because they can *see* that it's a text-file, without me having to change the file extension. Having type/creator-codes doesn't mean that the editor can't look at the file extension to figure out what kind of, say, text-file it is.

  111. Re:Ummm... MacOS X Does that, dude. by Anonymous Coward · · Score: 0

    Nautilus and Konqueror do it too, BTW.

  112. Don't be so hard on metadata by GPS+Pilot · · Score: 1
    It's pretty easy to change the file types/creator codes. I use Default Folder. Select a file in any open/save dialog, hit command-I, and a nice window comes up where you can edit file types/creator codes to your heart's content, and view a bunch of other info about the file. Granted, that's not as easy as being able to change file type right there in the Finder. But I'm sure a little searching would turn up some utilities that allow you to change file type even more conveniently -- maybe even right there in the Finder. Point is, changing file type isn't so difficult as to justify your long rant.

    As for changing file names of a big batch of files, A Better Finder Rename does this admirably. Not sure if it can also do batch changes of file type -- I'd have to check.

    I love the fact that some of my JPGs open in PictureViewer, some of them open in GraphicConverter, and some of them open in AOL.

    double-clicking opens in WinAMP, right-clicking and choosing "edit" opens in SoundForge.

    That is indeed a nice feature. But it has nothing to do with the fact that the file type metadata is appended to the file name. Apple could choose to implement this type of functionality with file types/creator codes in their present location. (And I wish they would!)

    file extensions... make the type of file crystal-clear

    Umm, no. I use my Windows box more hours per day than my Mac, and I doubt I will ever memorize what all those three-letter extensions signify. Sure, I know the common ones -- DOC, XLS, EXE -- but for 95% of them I have no idea.

    In contrast, icons on the Mac are usually so artfully done there's no question what app a document belongs to. Even various flavors of documents for a particular app (for example, the distinctive QuickTime icons for Mov, JPG, MPEG, MP3, etc.). And yes, Apple has human interface guidelines which tell developers how to make application icons quite distinct from document icons. Those guidelines are quite effective.

    --
    That that is is that that that that is not is not.
  113. AppleScript changes file/creator- put in OSA menu by Anonymous Coward · · Score: 0

    tell application "Finder"
        -- Bail out with a dialog if no document files are selected.
        if document files in selection as list is {} then
            display dialog "No document files are selected." buttons {"OK"} default button "OK" with icon caution
        else
            set these_items to (document files in selection) as list
            repeat with this_item in these_items
                -- Set the codes on the current item, not on the whole selection each pass.
                set the creator type of this_item to "TAR"
                set the file type of this_item to "TARF"
            end repeat
        end if
    end tell

  114. What a hoser. by Chasing+Amy · · Score: 2

    > Current shipping version of Windows (ME) does not allow that. You have to right click on the
    > file and get the properties dialog, because....the extensions are hidden.

    Bah. No one who knows anything about Windows uses ME. It's just Win98SE, but slower and bloated with more useless features. But even if you *are* using it, just click on the "View" pull-down on any open window, select "Folder Options," click on "View" tab, and un-check the box that says "Hide file extensions for known file types." Then the file extensions will always, always, appear for all files.

    Before you complain that that's complicated, you only have to do it once--and anyone with even the most basic skills knows that he needs to customize his preferences when setting up a box, whether Windows, Mac, or *nix. And you also should be aware that Apple has plans to add the same little preference about showing/hiding file extensions in the next OS X release. It's a way to make the extensions invisible to really dumb people who wouldn't know what to do with them and shouldn't be allowed to accidentally change them when they change a file's name, but still allow users with a modicum of experience (enough to un-check a box, which isn't much) to do their magic. And BTW, I did point out in my original comment that showing all file extensions is no longer the Windows default and must be changed by the user.

    Therefore, that "arcane" dialog box for changing extensions in WinME is unnecessary once you un-check a preference box.

    > Just not true. Netscape, IE, all every FTP app I can think of all reset the type and creator
    > on download

    If that's true, it's insane. Do you know what a CRC or CSV or SFV is? It's a small algorithmically-derived digest of a file, or of an entire set of files, which can be downloaded and automatically check the files to make sure none are corrupted. These are very commonly used to ensure file integrity, whenever there is a file or series of files of some importance which one wants to ensure are completely original and intact. Since type/creator is not in the resource fork, but in the file itself, altering them would alter the file in a very small and usually insignificant way, however the CRC/CSV/SFV would be rendered absolutely useless, since a change of even 1 bit would show up in the digest and therefore the file would be reported as corrupted. That is just insane, if true. My downloading software should IN NO WAY modify my files, without my explicit knowledge and permission. It renders file integrity checking utterly useless.
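
    A checksum of this kind covers only the bytes that are fed into it. The Python sketch below uses zlib.crc32 to show both halves of that: flipping one bit of the data changes the digest, while changing a label kept beside the data does not. The payload and the "type"/"creator" fields are invented for illustration; whether a real system keeps those codes inside or outside the hashed bytes is exactly the point in dispute here, and the dictionary layout is an assumption, not how any filesystem stores them.

```python
import zlib


def crc32_hex(data: bytes) -> str:
    """CRC-32 of the given bytes as 8 lowercase hex digits."""
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08x")


payload = b"example file contents"

# Flip the lowest bit of the first byte: a one-bit change inside the data.
flipped = bytes([payload[0] ^ 0x01]) + payload[1:]

# Hypothetical record that keeps type/creator next to the data, not inside it.
file_record = {"data": payload, "type": "JPEG", "creator": "8BIM"}
before = crc32_hex(file_record["data"])
file_record["type"] = "TIFF"  # relabel the file without touching its bytes
after = crc32_hex(file_record["data"])

print("one-bit change detected:", crc32_hex(payload) != crc32_hex(flipped))
print("relabel leaves digest alone:", before == after)
```

    So the integrity worry applies only if the downloader rewrites bytes that the digest was computed over.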

    The last version of Netscape I have ever used, on a Mac or otherwise, is 3.04Gold, so I would not know. I do recall that Netscape had a list of "helper applications" for what app should open what file--but that was only used if you selected within Netscape to "Open" the file upon download, not when just "saving" a file. I now use IE on Windows and Mozilla on Linux, though when Mozilla for Windows gets a bit better and faster, I plan to switch to that too. If what you said is true, though, it is absolutely stupid for the reason I mentioned above. You can't have file verification if your OS modifies the file in any way whatsoever. File verification is especially important when downloading software; the software companies, particularly ones who make security software, often offer CRCs to their customers so that the customer can be confident that he has not received a trojaned or otherwise interfered-with software package.

    Second, you may not download much in StuffIt archives (who does?). But I download an awful lot in Zip and RAR archives. Doing that on a Mac would not affect type/creator of the files within the archive. In fact, you will find that Zip is the most commonly used method for transferring multiple files at once.

    --

    Chasing Amy
    (We all chase Amy...)
    "The more corrupt the state, the more numerous the laws"-Tacitus
  115. Re:I HATE the MacOS and its stupid metadata! HATE by jafac · · Score: 2

    First, you say we don't need no stinkin metadata, then you say we need it with more power and flexibility.

    I think we can all agree that application binding is a cool thing, and saves us a lot of work as an automatic shortcut to opening documents.

    The problem here, and NOBODY has gotten this right so far as I'm concerned, is having a decent way for power users to edit and manipulate metadata, and configure the OS's treatment of it.

    Yes, the high and mighty programmers have access to it. The hackers have access to it. The grannies don't want or need access to it as long as application binding functions in a basic, and intelligent way. The power users have access to it, the same way the hackers do - but it's often a pain in the ass, using tools that weren't really designed to do anything other than mess around. Nothing useful can be done with these tools ON ANY OS, in terms of allowing a power user to quickly and easily manipulate the metadata to set up a custom behavior that suits his or her purposes.

    And that's really the whole problem.

    That, and of course the fact that filename extensions really have got to go.

    You'd think that the OS vendors would think about this, and provide the users with some nice tools. And I'll agree with you, Microsoft's solution is kind of nice. Where the CM for a Batch File in explorer will give you the choice to RUN the batch file, or Edit it. I think power users need that kind of flexibility for html as well, OPEN the file in a browser, or OPEN the file in an editor.
    I'm often frustrated with graphics files - sometimes I want to run a quick image viewer to display an image file - sometimes I want to tear it up in Photoshop. Launching Photoshop is a 60-second ordeal on some people's machines, and it's a necessary ordeal if you want to do serious editing.
    You see the problem here? Granny needs the file to open in her browser on a double-click. The power-user or content creator needs TWO choices on the execution of an icon. Edit or View. And in the case of executable content, Execute. This needs a user paradigm - probably a lot easier to use than a CM. And it must be MUCH quicker than the stupid "open this unregistered file in one of these " deal, which is annoying and slow on every OS I've seen it on.

    When I think about how annoying this problem is, and how NOBODY has ever offered a real, workable solution for this, I see - an opportunity. . .

    --

    These are my friends, See how they glisten. See this one shine, how he smiles in the light.
  116. Still not quite right.. by Auckerman · · Score: 1
    "Since type/creator is not in the resource fork, but in the file itself, altering them would alter the file in a very small and usually insignificant way"


    Bzzt..Wrong. It's stored in the inode tree, not the file. The moment you copy a MacOS file to a flat file system, it loses this information permanently. Now if you copy to a PC zip disk it APPEARS to keep it, but in reality, it stores the information in invisible files.


    "But I download an awful lot in Zip and RAR archives. Doing that on a Mac would not affect type/creator of the files within the archive"


    Bzzt...Wrong. Both zip and rar do NOT keep MacOS style metadata post compression. Not only that, but Mac users almost exclusively use StuffIt for compression. I know I do, because...drum roll please....it KEEPS the metadata post compression.

    --

    Burn Hollywood Burn
  117. re: malicious what? by kuma · · Score: 1

    what are you fucking talking about? malicious executable README displayed as what, a *text* file?

    no, the file will have an *application* icon. how many text files are distributed as applications?

    and even if the asshole who would try something like this uses a custom icon (and giving up a big clue, why distribute a macbinary text file?), he would be foiled by the power macos users, who do not always double-click documents.

    many of us drag files on top of the running app icon, because we need to deal with files using different applications.

    look, if it is so fucking easy to dupe mac users, why isn't there a vbs-outlook-microshit-virus insanity on macos? outside of office, viruses on macos are nearly extinct.

    to be clear, such files without special macintosh formatting to preserve file metadata/resources will open harmlessly as gibberish in a text editor... and if a trojan does sneak by a user, it will almost certainly be vulnerable to a quick force-quit, if not being shut down by virex or another utility.

  118. Then explain to me why... by Chasing+Amy · · Score: 1

    Then explain to me how this happened:

    Just recently, when I ran across some old Mac floppies, I read them and transferred the graphics files to my PC using MacOS 8.1 running on Basilisk II, then running HFV Explorer from within Windows I transferred all the graphics files from the Mac HFV file to my PC's regular FAT32 file system, then later transferred them to another HFV containing System 7. When I booted System 7 through Basilisk II the files still sometimes opened in JPEGview and sometimes in Graphic Converter. That sort of information should not have survived going from HFS to FAT32 back to HFS if what you say is the entire story. And HFV Explorer can't be responsible since if it were it would have chosen to give all the files of the same type the same type/creator codes.

    And the reason I was doing all these transfers, BTW, is that I'm creating a 200MB (or greater, as necessary) HFV file for use with 68k Mac emulators, filled with Mac software and files commonly found on a Mac in c. 1994-1995, for historical purposes. Many of those files, and much of that old software, are now nearly impossible to find, so I am archiving as much of it as I can to be released in 2005 as a sort of abandonware homage to the Macs that were in use when the Internet made it mainstream. I find most of it in dusty old subdirectories of academic institution FTPs, which I fear will start cleaning house sooner or later and permanently losing a lot of those archaic files.

    At any rate, I have seen type/creator codes, particularly on graphics files, survive on FAT32 systems and on Zip archives I made ages ago.

    You may continue to argue all you want, but this has been my experience, the experience of someone who has both FAT32 filesystems, HFS filesystems, and others all on the same computer. What you are saying may or may not be correct, but it is still clearly not the whole picture since it contradicts my direct and extensive experience at transferring files across filesystems.

    --

    Chasing Amy
    (We all chase Amy...)
    "The more corrupt the state, the more numerous the laws"-Tacitus
  119. reiserfs plugins? by yestertech · · Score: 1

    If I remember correctly, the reiserfs system is designed to be extensible via plugin modules, and could possibly add this type of metadata and even file specific database stuff (like artist and preferred playback volume for an mp3) that could then be searchable via file utilities.

    am I correct?

    --
    there's no replacement for displacement
  120. Re:I HATE the MacOS and its stupid metadata! HATE by Vulture_ · · Score: 1
    $ chtype foo.html
    text/plain
    $ edit foo.html
    [ ... your text editor starts ... ]
    $ chtype foo.html text/html
    $ edit foo.html
    [ ... your HTML editor starts ... ]

    And there you have it. File typing and filesystem metadata for the UNIX world. Elegant, no? Just because existing filesystem metadata implementations are absurdly complex doesn't mean they have to be. Note also the use of MIME types, to further simplify and standardize the file types.
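
    For what it's worth, the session above can be modelled in a few lines of Python. This is only a sketch: a dictionary stands in for the per-file filesystem metadata, the handler table and program names are invented, and chtype and resolve here are hypothetical helpers rather than real commands.

```python
from typing import Dict, Tuple

# Hypothetical table mapping (MIME type, action) to a program name.
HANDLERS: Dict[Tuple[str, str], str] = {
    ("text/plain", "edit"): "text-editor",
    ("text/plain", "view"): "pager",
    ("text/html", "edit"): "html-editor",
    ("text/html", "view"): "web-browser",
}

# Stand-in for per-file filesystem metadata: path -> MIME type.
file_types: Dict[str, str] = {}


def chtype(path: str, mime: str) -> None:
    """Record a new MIME type for the file; the file's bytes are not touched."""
    file_types[path] = mime


def resolve(action: str, path: str) -> str:
    """Pick the program for an action from the file's stored type."""
    mime = file_types.get(path, "application/octet-stream")
    try:
        return HANDLERS[(mime, action)]
    except KeyError:
        raise LookupError(f"no {action} handler for {mime}") from None


# Mirror the shell session: same file, same action, different stored type.
chtype("foo.html", "text/plain")
first = resolve("edit", "foo.html")
chtype("foo.html", "text/html")
second = resolve("edit", "foo.html")
print(first, second)
```

    Keying the table on both type and action is what lets "view" and "edit" go to different programs for the same file.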

    Debian already does it sorta like this, only it uses file(1) magic to determine the MIME type, rather than using filesystem metadata. (Debian users: It's in the mime-support package.) Still, it provides view, edit, and other such commands, for performing various operations on files, regardless of their file type. It also can understand when (not) to run X applications to view/edit/whatever the file.

    Yet another fine example of the reasons why Debian is the One True Distribution.

    --

    The only way the typical /.er can pick up a chick is with a forklift. -- AC

  121. third dimension? by KurdtX · · Score: 2

    So would data about Metadata be... "Gitadata"?

    I know, but I just had to.

    --

    Kurdt
    I'm not anti-social. Just pro-technology.
  122. Re: malicious what? by castlan · · Score: 1

    Although I generally agree with your viewpoint, I have to point out that I have seen README type executable files in a small but significant number of shareware programs.

    The text file is usually distributed in an archive with the executable program, so that it can be hard to notice the custom icon.

    So in conclusion, the best way to operate on a Mac OS ( X) would be the standard power-user setup, list view with file details (Metadata) shown, and launch non-executable documents by dragging them onto the applications. This allows maximum flexibility and control. Many Mac users use program launchers, floating palettes and toolbars containing their favorite apps for easy access. I usually just have aliases of my media viewing apps in a folder that can pop up when I need it, then get out of sight when I don't.

    BTW, for non Mac users, an alias is like a soft link to an inode. Thus it behaves like a soft link except that it isn't broken when the original changes locations (like a hard link).

  123. Re: malicious what? by castlan · · Score: 1

    Although I generally agree with your viewpoint, I have to point out that I have seen README type executable files in a small but significant number of shareware programs.

    The text file is usually distributed in an archive with the executable program, so that it can be hard to notice the custom icon.

    So in conclusion, the best way to operate on Mac OS (not X) would be the standard power-user setup, list view with file details (Metadata) shown, and launch non-executable documents by dragging them onto the applications. This allows maximum flexibility and control. Many Mac users use program launchers, floating palettes and toolbars containing their favorite apps for easy access. I usually just have aliases of my media viewing apps in a folder that can pop up when I need it, then get out of sight when I don't.

    BTW, for non Mac users, an alias is like a soft link to an inode. Thus it behaves like a soft link except that it isn't broken when the original changes locations (like a hard link).