Should "B" be the Same as "b"?
joshua42 asks: "Although having used Linux and FreeBSD for many years, I have yet to come across anyone seriously questioning the traditional UNIX style file system name paradigm. With an Amiga background (It should be the same for people growing up with Windows, or those growing up with no computer at all (God forbid!).) it took me quite a while to get used to 'A' and 'a' being treated as different characters. This is of course fairly easy to accept and to understand if you have a technical background. I do however have a hard time seeing how aunt Ginny will ever be able to distinguish between her 'Letter.txt', 'LETTER.TXT' and 'letter.txt' files. In real life, upper and lower case letters represent almost identical information to most people. Has any thought been given to this issue, now that our favorite OS is becoming increasingly mainstream? Does it need to be addressed? Have any attempts been made? What are the implications to parts outside the file systems?"
This is an interesting point. As Unix grows more and more popular, the simple things we've taken for granted about the filesystem may stand in the way of general users adopting it. What ways can you think of that will mitigate this problem for new Linux users without actually affecting too much? Special shells for novice users that can simplify much of the complexity may be the way to go here.
>Although having used Linux and FreeBSD for many years, I have yet to
>come across anyone seriously questioning the traditional UNIX style
>file system name paradigm.
>With an Amiga background (It should be the same for people growing up
>with Windows, or those growing up with no computer at all (God
>forbid!).) it took me quite a while to get used to 'A' and 'a' being
>treated as different characters. This is of course fairly easy to
>accept and to understand if you have a technical background.
>I do however have a hard time seeing how aunt Ginny will
>ever be able to distinguish between her 'Letter.txt', 'LETTER.TXT' and
>'letter.txt' files.
Just as aunt Ginny was somehow able to grasp that her name is
written 'aunt Ginny' and not 'aunt gInNy', 'aunt gINNy', or some other
combination. Give her a little credit. Simply explain that the case
is part of the file name. Your example Letter.txt file names would be
a perfect way to show her the difference. Just make each contain
different information, and open each one to show her they are
different.
File systems should be case sensitive. An upper case 'A' is a different
character than a lower case 'a'. We should not confuse people by
tricking them when they create file names.
>In real life, upper and lower case letters represent almost identical
>information to most people.
Almost, but not identical.
>Has any thought been given to this issue, now that our favorite OS is
>becoming increasingly mainstream?
>
>Does it need to be addressed?
No.
>Have any attempts been made?
I hope not. Mount a case insensitive file system if you want one.
Leave existing file systems alone.
>What are the implications to parts outside the file systems?" This is
>an interesting point. As Unix grows more and more popular, the simple
>things we've taken for granted about the filesystem may stand in the
>way of general users adopting it.
The sooner people accept that 'Ginny' and 'gInNy' are not the same, the
sooner they will understand how to interact with a computer.
>What ways can you think of that will mitigate this problem for new
>Linux users without actually affecting too much? Special shells for
>novice users, that can simplify much of the complexity may be the way
>to go, here.
How about a mouse-click'n GUI like GNOME, KDE, etc.?
The problem is more complicated than the question makes it out to be. An ideal filesystem should allow arbitrary binary data to make up a filename, so that filenames can be Unicode: Chinese speakers can name files in Chinese, and math professors can use the Unicode for a math formula as the name of a document describing how to solve it. When you think about it in this bigger sense, it becomes a lot harder.
Ideally the encoding method (Unicode in this example) should provide some way of seeing the equivalency of certain characters (two different representations of the equal sign, two different cases of the letter A, etc.), and the application should be able to make use of this during a regex search, or maybe even during a library-wrapped open() or readdir() call, whether the application is "Windows Explorer", "bash", or anything else.
Ultimately this has to be resolved in userland tools and the libraries that support them. The best answer for the filesystem layer is to support all possible characters literally and meaningfully in filenames, so as not to restrict the schemes layered on top of it.
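To make the userland idea concrete, here is a rough sketch in Python of what such a library wrapper might do. The names fold_name and find_equivalent are made up for illustration, and NFKC normalization followed by case folding is only one simplified notion of "equivalent" (the full Unicode caseless-matching rules are more involved). The filesystem itself stays strictly literal; only the lookup is forgiving.

```python
import unicodedata


def fold_name(name: str) -> str:
    """Return a comparison key for a filename (hypothetical helper).

    NFKC merges compatibility variants, such as the fullwidth equals
    sign and the plain '='. casefold() then removes case distinctions.
    This is a simplification of full Unicode caseless matching.
    """
    return unicodedata.normalize("NFKC", name).casefold()


def find_equivalent(names, wanted):
    """Return every entry in names that is equivalent to wanted.

    The stored names are never altered; equivalence is decided purely
    in userland, on top of a case-sensitive listing.
    """
    key = fold_name(wanted)
    return [n for n in names if fold_name(n) == key]
```

A shell or file manager could pass the result of os.listdir() as names. When more than one entry comes back, it could ask the user which file was meant instead of guessing.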
11*43+456^2
iF yOU wROTE a lETTER tO yOUR aUNT gINNY lIKE tHIS wOULD sHE nOTICE sOMETHING wRONG wITH iT?
If you think she would, then she can grasp the concept that case makes a difference. Give her a little credit.
If tits were wings it'd be flying around.
... and now that I got your attention, let me specify that all other file systems are obsolete as well in the context of the USER INTERFACE.
Frankly, aunt Ginny should *never* have to deal with files and file names. She should not need to know what a file is, nor choose to "save" or "discard" her work after she has written the letter to her friend Margaret. She does not know her HD from her RAM, and all the better. She would worry to death over having her letter spun around on a magnetic disc; it would get all jumbled up for sure!
A file system is an internal, abstract and archaic database that is familiar to programmers and geeks, but a lousy way to represent data for the general user. There are few things worse than navigating a blind hierarchy of unknown folders with no contextual guide to help.
The system should remember the letter when it is written, keep tabs on when it was written, put the subject in a "recent letters" list and generally manage the internal filing transparently to the user. The storage capacity of a modern computer can last aunt Ginny for years; the real trouble is in FINDING her data, and file names alone do little good for that.
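As a rough illustration of that kind of filing, here is a toy sketch in Python. DocumentStore, remember, recent and search are invented names, not any real system's API, and everything lives in memory. The point is only that the user supplies a subject and some text, and never a file name.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Document:
    subject: str
    written: datetime
    body: str


class DocumentStore:
    """Toy in-memory store (hypothetical): no file names, no folders."""

    def __init__(self):
        self._docs = []

    def remember(self, subject, body, written=None):
        # The system records the time itself; the user never "saves".
        doc = Document(subject, written or datetime.now(), body)
        self._docs.append(doc)
        return doc

    def recent(self, limit=5):
        # The "recent letters" list: newest first.
        ordered = sorted(self._docs, key=lambda d: d.written, reverse=True)
        return ordered[:limit]

    def search(self, text):
        # Finding is by content and is deliberately case-insensitive.
        needle = text.casefold()
        return [d for d in self._docs
                if needle in d.subject.casefold()
                or needle in d.body.casefold()]
```

With something like this, aunt Ginny would look for "margaret" and get back both the letter and the pie recipe, however she happened to capitalize the name.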
For a wonderful example of how well you could do without a filesystem, look at the operation of the Palm OS devices. Anyone could learn to use them. No files in sight! It's only recently that the clever engineers at Palm jumped off the deep end by adding a file system for the flash carts. Anyone who has ever used those knows what a nightmare managing them is.
Aunt Ginny knows fsck all about file systems. Let's keep it that way.
(Oh, and the answer in the context of user interfaces? Go for the most HUMAN representation. People are not very sensitive at all to upper/lowercase letters. We should not punish them for this.)
Jouni
Jouni Mannonen | Game Designer, Consultant