Imagine if OS X did not reject invalid surrogate pairs for filenames. What would happen, then, if you'd have a file with such an invalid pair, and then did a readdir() on it - what would you expect to see in d_name after it converts to UTF-8?
I would expect to see the unpaired surrogate halves each converted to the correct 3-byte UTF-8 encoding for that code point.
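To make that concrete, here is a minimal Python sketch of the conversion I mean. The function name is made up for illustration, and it leans on Python's "surrogatepass" error handler to let lone surrogates through; it is not any filesystem's actual code.

```python
def utf16_units_to_utf8(units):
    """Convert 16-bit code units to UTF-8 bytes without ever failing.

    Valid surrogate pairs become one 4-byte sequence; an unpaired
    surrogate half becomes the 3-byte encoding of its own code point.
    """
    raw = b"".join(u.to_bytes(2, "little") for u in units)
    text = raw.decode("utf-16-le", errors="surrogatepass")
    return text.encode("utf-8", errors="surrogatepass")


# 'A', an unpaired high surrogate, 'B'
print(utf16_units_to_utf8([0x0041, 0xD800, 0x0042]))
# A valid pair (U+1F600) still turns into a normal 4-byte sequence
print(utf16_units_to_utf8([0xD83D, 0xDE00]))
```

Every distinct array of 16-bit units gives a distinct array of bytes, which is all a filename API needs.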
This really isn't that hard. Yes 8-bit systems are a mess of legacy encodings, so my recommendation is that the problem be ignored as much as possible, moved as close to final interpretation on the display as you can. All other solutions just perpetuate these problems by making it impossible to change them to UTF-8. And as you are well aware, any text that is not UTF-8 is going to display wrong in a huge number of programs no matter how you set the locale.
16-bit systems do not have this encoding mess, as they are virtually 100% UTF-16 or UCS-2 which have identical and obvious and lossless conversions to UTF-8, therefore there is no reason for the filesystem to not do this.
Your example of Joliet on the legacy Russian locale: the correct result is that the 16-bit filenames on the disk are converted to UTF-8 by the filesystem driver. If the program that prints on the terminal is aware that the terminal is going to interpret these as 1-byte Russian encoding, then it is that program's responsibility to convert the names again, although it is likely nobody is going to complain much if it does not do so and the terminal produces mojibake. 90% of the programs those filenames are going to will produce the correct output, because they only understand UTF-8. Browsers, for instance.
Now if the filesystem has 8-bit filenames stored in that Russian locale, the filesystem should return them unchanged. It can't do any better, it really should have no idea of any "locale" and I don't think it should attempt to convert to UTF-8. The reason it is safe to always convert 16-bit filenames is because currently 100% of them are in UTF-16 (or UCS-2 which only differs in unassigned Unicode points anyway and can be converted with the same code as UTF-16) and UTF-16 to UTF-8 conversion is lossless even for invalid UTF-16 so there is no problem with the "wrong locale" being used. That is not true of 8-bit encodings, they can be in any number of locales and the raw bytes are much more useful than the incorrect recoding. It is true that these strings will display "correctly" if the terminal happens to be using the same encoding, but this is just obsolete implementations that should be fixed. Due to the design of UTF-8, those Russian names are almost 100% likely to be invalid UTF-8, so a correct program could check for this, and if it looks like invalid UTF-8 (and ONLY then) it can use some legacy code to figure out what encoding it probably is and print it. Or, more likely, it can just print it as UTF-8 and produce the *same* mojibake that 90% of the software in the world will produce, which would be an awful lot more user friendly than the current behavior, and would quickly get everybody switched to UTF-8 and rid us of "locale" forever.
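The "check for invalid UTF-8, and ONLY then fall back" idea is a few lines in Python. This is a sketch under my own assumptions: the function name is invented, and KOI8-R is hard-wired as the guess where a real program would have to pick the legacy encoding some other way.

```python
def display_name(raw: bytes, legacy_encoding: str = "koi8-r") -> str:
    """Return text for display; the raw bytes remain the real filename."""
    try:
        # Valid UTF-8 is always treated as UTF-8, whatever the locale says.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Only invalid UTF-8 gets the legacy guess (assumed KOI8-R here).
        return raw.decode(legacy_encoding, errors="replace")


print(display_name("привет".encode("utf-8")))
print(display_name("привет".encode("koi8-r")))
```

Note that the conversion happens only at display time. The bytes passed to open() are never touched.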
On Windows what I use for all filesystem access (in fltk) is a converter from UTF-8 to UTF-16 that converts all error bytes as though they are in the CP1252 encoding. This is lossy but not that bad as I already have to deal with "case independence" so conversion to a filename is lossy anyway. The reverse conversion from UTF-16 to UTF-8 is lossless (unpaired surrogates turn into the obvious UTF-8 encoding) so any result of readdir will work, have all the filenames different, and those filenames can be used to open the files. The problem is that these functions are not "fopen()" and so on, but of course have different names, as they are not part of libc. This greatly limits usability of them as I cannot deal with any libraries that open files on their own.
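Roughly what that converter does, sketched in Python rather than the actual fltk C code (the names here are invented for the example). One assumption to flag: the five bytes that CP1252 leaves undefined are mapped as Latin-1 so the function can never fail; that is my choice for this sketch.

```python
import codecs


def _cp1252_fallback(err):
    """Decode each byte of an invalid UTF-8 sequence as CP1252."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    chars = []
    for byte in err.object[err.start:err.end]:
        one = bytes([byte])
        try:
            chars.append(one.decode("cp1252"))
        except UnicodeDecodeError:
            # Undefined in CP1252 (0x81, 0x8D, ...): assume Latin-1 instead.
            chars.append(one.decode("latin-1"))
    return "".join(chars), err.end


codecs.register_error("cp1252_fallback", _cp1252_fallback)


def utf8_to_utf16(name: bytes) -> bytes:
    """UTF-8 to UTF-16-LE, with error bytes treated as CP1252 text."""
    text = name.decode("utf-8", errors="cp1252_fallback")
    return text.encode("utf-16-le", errors="surrogatepass")


# A Latin-1 style "café" and a real UTF-8 "café" give the same UTF-16 name,
# which is exactly why this direction is lossy.
print(utf8_to_utf16(b"caf\xe9") == utf8_to_utf16("café".encode("utf-8")))
```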
For Windows to do POSIX correctly it must implement this at a low level so all the POSIX calls and everything that calls them goes through the same code. I think the easiest way is to change the "mb" (and probably the "a") apis to be forced to use this conversion, as this useful feature would be available to Windows software too. Changing just the POSIX api to do this means the POSIX programs could not call Windows libraries that don't use this when opening files which would make portability much harder. It would also help considerably for access to remote filesystems that use 8-bit filenames if there was a "tunnel" that sent the 8-bit text unchanged, this is because the conversion UTF-8 to UTF-16 and back is lossy. I can understand Microsoft balking at that idea, and it probably is livable without it.
You are wrong. You can do Unicode by making all the api's be UTF-8. There is no need for UTF-16 (which is just another variable length encoding, have you heard of non-BMP characters?).
Absolutely the correct thing for a Linux NTFS driver to do is to convert the filenames to UTF-8. Yes if they echo on a terminal that is translating byte streams to glyphs using the rules of KOI8-R or whatever, they will display as mojibake. That does not mean the filenames are bad, in fact if the user carefully types the exact same mojibake in they will name the file, proving that it is in fact the name of the file.
What does break is lossy conversion. You CANNOT translate NTFS filenames to any non-Unicode encoding, because you will lose characters that were in them and therefore it is impossible to name all the possible files!
Another lossy conversion is to attempt to convert UTF-8 to UTF-16. You cannot do this (at least not while preserving the conversions done by most programs from UTF-16 to UTF-8). Therefore you certainly do NOT want to make a "wchar" api in POSIX, it would be VERY VERY VERY BAD!!!! OS X tries to fix this by rejecting invalid byte encodings for filenames so there cannot be a file that has a name that cannot pass through their 16-bit api, but I consider this a rather stupid movement of code that should be very high level down into the driver. As there are lots of filesystems that allow an arbitrary byte stream to be a filename, you cannot use any "wchar" api for these.
Actually you can see the problem in Windows when it attempts to show remote Unix filesystems using NFS. Instead of displaying the filenames as UTF-8, it displays them as ISO-8859-1. This is the typical broken response that I have seen programmers resort to when they are finally hit by the clue-by-4 and realize that there is no magical part of the computer that forces bytes to always be valid UTF-8. Microsoft is not doing too bad, I certainly have seen ones that discard *all* high bit bytes. Yuck.
Does OS X really do that? It seems to be using UTF-8 locale by default; are you sure it's not just that, and if you change it, it won't change filename treatment accordingly?
Yes the locale for libc is set to UTF-8 on OSX. It is on Linux as well.
If so, then it's broken in a different way, namely: you can't take an output of any locale-aware ANSI C or POSIX function, and pass it to fopen() and friends expecting meaningful results.
You seem to have a weird idea of what is a "meaningful result". For me I expect that the exact same array of bytes will open the exact same file every single time!
Anyway I have OSX here to check, it appears to work the way I expect. I made filenames with 2, 3, and 4 byte UTF-8 characters in them (mostly by cutting and pasting from web pages into the terminal). They list correctly, but when I set the locale to "C" they then display 2,3, or 4 '?' characters (not one if there was some kind of locale-aware conversion from UTF-16). I wasted some time trying to get them to display as ISO-8859-1 bytes, but it appears Terminal only understands if the locale is UTF-8, all others it prints all high-bit bytes as '?'. Feeding the output of ls to od also shows that the exact same bytes occur no matter what the locale is. I did some experiments in tcl as well, all showing that the filenames report the same number of bytes, the number expected for UTF-8, no matter what the locale is. So far I have been pretty happy with OSX, they seem to have their brains together and don't assign magical properties to strings, considerably better than the morons writing a lot of Linux and Windows.
Feeding an invalid UTF-8 string to the api should result in exactly the same sort of result as feeding an invalid UTF-16 string to the Windows wchar api (a UTF-16 string can easily be invalid by having an unmatched surrogate half).
I am reasonably certain that Windows will go and create the file despite the invalid UTF-16. In fact they never test for this, because they are well aware that it would be a total pain in the ass for anybody programming to have to make sure an array contains some magical pattern before it can be used as a filename.
However these very same people will often then think that UTF-8 must be treated special. They will literally say an array is "not UTF-8" because it has invalid sequences in it (you did above, for instance). This is like saying a newspaper is not in English if there is a misspelled word in it. Technically true but pretty useless and it is far better to treat it as English than to define some new language for it. An easy way to see if they are either hypocrites or (more likely) idiot savants is to see what the same programmers do about invalid UTF-16. They will often act completely different, proving their total illogical stance.
Due to the unfortunate use of 16-bit strings in a lot of apis, the best solution so far has been to preserve UTF-8 as long as possible, then do the lossy conversion by taking each error byte (which will always have the high bit set) and encoding it as a low (trailing) surrogate half character in the range 0xDC80-0xDCFF. An unpaired surrogate like that is invalid UTF-16, so on the off chance the system actually produces an "invalid encoding" error for UTF-16, it will produce one for the invalid UTF-8 as well. I think Windows does not do this, so it will instead create a file.
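Python's built-in "surrogateescape" error handler (PEP 383) happens to use the same 0xDC80-0xDCFF range, so the mapping can be tried out directly. The two helper names below are invented for this sketch; it is an illustration of the idea, not the code any OS or toolkit actually runs.

```python
def bytes_to_utf16_units(name: bytes):
    """UTF-8 bytes to 16-bit units; each error byte 0xNN becomes 0xDCNN."""
    text = name.decode("utf-8", errors="surrogateescape")
    data = text.encode("utf-16-le", errors="surrogatepass")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def utf16_units_to_bytes(units):
    """Reverse mapping: 0xDC80-0xDCFF units turn back into the raw byte."""
    data = b"".join(u.to_bytes(2, "little") for u in units)
    text = data.decode("utf-16-le", errors="surrogatepass")
    return text.encode("utf-8", errors="surrogateescape")


units = bytes_to_utf16_units(b"a\xff")
print([hex(u) for u in units])
print(utf16_units_to_bytes(units))
```

Valid UTF-8 goes through unchanged, and the error bytes survive a round trip through the 16-bit api as long as nothing in between rejects unpaired surrogates.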
This whole thing is far from perfect but we have to live with a bunch of politically-correct idiots who thought wchar was the solution.
Then the file system driver translates the 16 bit filenames to an 8 bit encoding. At no point does the kernel or runtime api see anything other than 8-bit encodings.
I strongly suspect that there are equally idiotic people writing these drivers as it sounds like there are at Microsoft, and the translation may depend on the "locale" rather than being forced to UTF-8. In that case I suspect Linux is going to be nearly useless reading these disks until they fix that, for the same reason this SFU stuff is going to be useless on Windows.
This is not a limitation of the compiler. You cannot set UTF-8 with GCC either. The actual system API refuses.
The LC_ALL and similar environment variables should not affect how filenames are stored on disk. That would be incompatible with POSIX. Although I think it would be best to ignore them entirely, if you wanted to make the POSIX subsystem obey them they must limit themselves to changing printf and scanf functions.
If POSIX "open()" maps to the "mb" api in Windows then the Windows "locale" MUST be set to UTF-8 with a mode such that errors turn unambiguously into UTF-16 errors without throwing exceptions (it is not necessary that these files actually open, only that they return the same type of error that other invalid characters do). Any other situation makes this POSIX api useless.
Wrong. Unix returns the 8-bit data UNCHANGED from the disk. The locale does not alter what filenames are seen.
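You can check this yourself. The sketch below assumes a Linux box with a filesystem that accepts arbitrary bytes in names (ext4, tmpfs and so on); the function name is made up, and on a filesystem that rejects invalid UTF-8 the file creation will simply fail.

```python
import locale
import os
import tempfile


def list_raw_names(raw_name: bytes):
    """Create a file named raw_name, then list it before and after
    switching the process locale to "C". Returns both listings."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_bytes = os.fsencode(tmp)
        path = os.path.join(tmp_bytes, raw_name)
        with open(path, "wb"):
            pass
        before = os.listdir(tmp_bytes)   # bytes in, bytes out
        locale.setlocale(locale.LC_ALL, "C")
        after = os.listdir(tmp_bytes)
        # The same bytes still open the same file.
        with open(path, "rb"):
            pass
        return before, after


raw = "привет.txt".encode("koi8-r")      # deliberately not valid UTF-8
before, after = list_raw_names(raw)
print(before == after == [raw])
```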
for the likes of readdir and other multibyte (rather than widechar) functions, it uses the current locale to encode filenames.
Okay then the answer is NO. It is not possible to name every file. Therefore this SFU is useless.
If your system locale is set to UTF-8, I believe it will feed you UTF-8 (mine is set to CP1251, and for some mysterious reason it requires a reboot to change, so I can't check this for sure - sorry).
You have now identified the obvious solution. However from everything I have tried, it appears to be impossible for a program to force the API to UTF-8. You have to set a registry or something and start a new login. The fact that this is impossible leads me to believe that Microsoft is still actively trying to prevent compatibility. It is also likely that "UTF-8" locale will throw exceptions, crash, or do other strange stuff when the UTF-8 byte stream contains errors, this makes it extremely unreliable.
OSX uses an 8-bit api to pass filenames to the kernel and thus to the filesystem drivers.
It is quite possible the filesystem does UTF-8 to UTF-16 conversion when looking up files. That is irrelevant, except for the fact that this same feature is needed for Windows to correctly support POSIX.
It does appear Windows is going to continue to be broken, as you say it uses the "current locale" to translate the filenames from POSIX into the ones on the disk. This is unlike OSX and also totally incompatible with any 8-bit filesystems and with Unix. Unix relies on a given string identifying the same file no matter what the "locale" is.
This is trivial to fix by forcing the "locale" to be UTF-8. The fact that Microsoft does not do this, and seems to distinctly avoid providing an interface so a program can decide to do so (I don't think it requires a reboot as you say, but it does need a new shell created), indicates to me that they are still, on purpose, making sure portability is impossible.
I think the general complaint is that the "ribbon" really isn't new. There have been "tabs" for a long, long time, including in Microsoft software. And some toolkits (not sure about MFC) consider the "menubar" to be a normal widget and therefore can be put inside a "tab". Furthermore many toolkits have considered a "menubar" to not only contain "submenus" but also "buttons" (often as a submenu title with no children). I certainly did this in a toolkit that is now almost 20 years old (see fltk).
Like a lot of Microsoft stuff, a lot of the angst is that Microsoft pretends they have come up with something *new* when in fact it can be a decades old idea. Also that they absolutely refuse to use any existing term for that idea, in order to defeat searches for prior art (ie the term "ribbon" rather than "tabs" which is certainly the keyword all previous versions used).
Conversely I think the anti-Microsoft crowd is to blame. They sit on their duff, scared to death of being "hard to use" or the dreaded "inconsistent" and thus refusing to actually incorporate any of these new ideas into actual end user products. So nobody actually sees these except for the programmers at Microsoft who examine the code and decide what to steal. And then when Microsoft comes up with it, they try to project their own fears onto them and try to say "oh Microsoft is being inconsistent and hard to use!!!"
TermKit looks incredibly stupid, it's the same koolaid as "powershell" where the main program needed is "serialize" so that you can convert your "objects" into text and actually get some work done.
JSON fortunately *is* text so at least they won't do that (though there is going to have to be some way to strip the "JSON-ness" from it so that the piped program treats it as text). But since it is text the existing pipes can send it! Just choose the right program.
The developer of TermKit seems infatuated by the idea that "cat foo.png" should display the picture. No it shouldn't, it should be sending the bytes in the picture to stdout. Maybe the terminal on the end can recognize it is a .png and display if you really think that is kool. But that won't work if "cat" is required to recognize it and wrap it in JSON.
Hey better yet, why not just have the command "foo.png" display the picture. This is done by looking up the application needed for a file and running it. This amazing ability has been done by GUI desktops for 20 years now. I know people may find it hard to believe, but "look up the application needed for this file" is not linked inseparably to mouse clicks. By thinking REALLY REALLY HARD, I bet you can figure out how to program it to happen without a mouse click! Of course this has apparently eluded Linux and Windows and OSX and every other programmer for decades so maybe it is not as bloody obvious as I think it is...
Even though I certainly did not like him, I think you really insult W by comparing him to Palin. They are worlds apart.
Why can't you drive your car to the train station?
That always sounded really stupid to me. Oh, no, we can't have the subway stop here, the gang bangers might use it to get here and do drive-by shootings!
Hint: those people you fear have cars.
Of course they were designed to evolve toward sharing, and also toward not sharing. The machines were quite simple and were not going to produce any other evolved program. What you are claiming is that the robots got zero benefit by not sharing but this is not true. Many of their simulations (the ones where the population was random and not related) quickly went toward all robots not sharing. So your understanding is wrong.
What they showed was that the tendency toward evolving sharing exactly matched how similar the robots were, showing that it was directly related to saving similar genes. If all the robots were random variations of previous successful versions they all ended up not sharing. If all the robots were identical copies of the most successful one from the previous generation then they all ended up sharing.
I think you are misunderstanding what I am saying. The application CANNOT prevent other applications from raising their windows. That is the whole point! There is no such thing as a "modal dialog box" so an application cannot create such a thing. I think security could be a LOT better than now with these simplifications.
What an application can do is raise its OWN windows. More importantly, they ARE NOT RAISED BY ANYTHING ELSE. This means there is LESS raising, not more like you seem to think.
I believe it is a deal-breaker. It makes overlapping windows useless, forcing all applications to go to a single-window model with tiled contents. Eliminating an entire quite useful possibility for interaction is a big problem, I think. Note that Linux is not much better, only through a few fortunate well-designed older window managers did this work (and even they were broken when the user clicked on the window borders), but newer ones are making it increasingly impossible. And as Gimp proves, if you want to be portable you might as well give up even if Linux works perfectly, as your design will never work on Windows.
What really annoys me is that it is so trivially solved: the system should not raise (or lower or otherwise arrange, or map or unmap) a window EVER. The application must do it in response to events. Then the application would have complete control and can keep the windows in whatever order it wants (and change this order at any time it wants), and this would probably take about 1/10 as much code as is being devoted to window manager hints and window classes now. If you are paranoid about back-compatibility then the app can turn on a flag to tell the wm to not do its own thing.
That may be new. No possible combination of event filters and a zillion other things would allow a click in a window to not raise it for me, except for the ability to replicate the "hung app" behavior where the window did not raise, but also did not respond to any mouse clicks or keystrokes.
Making the window topmost is useless, because the user cannot put *other* windows on top of it.
The desired behavior which is as far as I can tell impossible in Windows is:
Windows A and B and C. C is the "child". C is atop A and B at all times (that is a requirement). It is not a requirement but it is acceptable if clicking in A puts them in the order C,A,B while clicking in B puts them in the order C,B,A. In addition clicking anywhere else other than A and B should act normal, C should not be any kind of topmost or otherwise "funny" window and should be indistinguishable from a child of the topmost of A and B.
If they are going to use 16 bits they should use IEEE floating point half. What I think you are suggesting is long-obsolete technology and would be a step backwards.
Being able to arrange the layer names in a hierarchy is trivial and if that was what was wanted I'm sure Gimp would have had this years ago.
You seem to be missing the real point of layer groups, which is that unless all the operations done by the layers are identical, you need to be able to group them to get usable combinations. "(A times B) over C" is different than "A times (B over C)" and cannot be achieved unless you can group A and B together.
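If you want numbers, here is a quick Python check. The assumptions are mine: pixels are (value, alpha) pairs with premultiplied alpha, "times" multiplies every channel, and "over" is the usual Porter-Duff over. The sample values are arbitrary.

```python
def times(a, b):
    """Multiply two premultiplied pixels channel by channel."""
    return tuple(x * y for x, y in zip(a, b))


def over(a, b):
    """Porter-Duff 'a over b' for premultiplied (value, alpha) pixels."""
    alpha = a[-1]
    return tuple(x + y * (1.0 - alpha) for x, y in zip(a, b))


A = (0.5, 1.0)   # opaque
B = (0.4, 0.5)   # half transparent
C = (0.8, 1.0)   # opaque background

grouped = over(times(A, B), C)     # (A times B) over C
ungrouped = times(A, over(B, C))   # A times (B over C)
print(grouped, ungrouped)
```

The grouped result has a value near 0.6 and the ungrouped one near 0.4, so a flat layer stack cannot produce the first result no matter how you order it.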
Are you sure about #2? I find you can raise them.
The problem on both Windows and Linux is that the only useful thing you can do with a window to fix these is to make it a "floating child window" (in Windows's terms) or a "transient for" in X terms. Unfortunately this only keeps the window atop a *single* window, not a set, making it not work correctly for Gimp editing more than one image.
Linux/X had a few features so Gimp tried to do what it could to work around this. For a long time they made the windows "normal" and relied on the fact that X window managers can usually be set to not "click to top" so when the user clicks in an image it does not raise over the tools. You cannot do this on Windows however, and newer Linux window managers are making this setting increasingly impossible. Also as you saw, the OS thinks you have a whole lot of documents opened, as Linux started copying the taskbar stuff, and they also started copying the bloated window decorations off Windows as well, so you only get "thin" window borders if you mark it as a child window.
More recent versions do set the child window indicator, and attempt to change it as the user clicks in different images to raise them, thus trying to keep them atop the top-most opened image. I think this was pretty buggy because the window manager was not designed to do this. The equivalent on Windows is impossible except by destroying and recreating the window which would make them blink. Even more recently they finally fixed the "window group" in X window managers so you could keep a window atop any of a set, still this is pretty unreliable.
As Linux kept copying (mis)features from Windows this gradually got worse and it looks like the Gimp guys have decided they have to fix it for Linux in the same way most programs do (ie make only one window, which is the only way to actually control the order your windows are in). This will have the fortunate side-effect of fixing it for Windows as well.
However it is unfortunate that bad systems are preventing actual ideas in ui design from being tried. It is also unfortunate that the current systems are enormously complex compared to a much more powerful and useful system. Just in case you cannot figure it out, here is how it should work:
1. A program can directly set flags on windows to say whether they appear in the task bar and what kind of decorations. They are appearance only and have no behavior changes!
2. No window ever ever ever raises, lowers, appears, or disappears except by a call from the application.
3. Add a non-blinking api to place a whole set of windows in a desired stacking order and visibility.
4. Applications are fixed so that they respond to clicks by raising/showing/hiding whatever windows they want, regardless of the flags or anything the OS thinks.
the fact that it's impossible to have a window in-focus without raising it
DING DING DING! We have a winner! Despite your attempt to put disdain on the claim that Windows is broken, you managed to exactly state the problem.
They "work on Windows" by using a single window and tiling it. This is Windows's fault because they made it impossible to stop a click from raising a window.
Of course the idiots designing both KDE and Gnome are copying this wonderful feature slavishly from Windows, leading to every program including Gimp being forced to a single-window design as well.
Mac is just messed up (clicking raises windows there, too) but they have about 100 "window modes" that can be used to keep floating windows on top.
Hints to designers of apis: THE PROGRAM CAN RAISE ITSELF IF IT WANTS!!!!! It is not rocket science and all systems already have a "raise this window" api call so there is not even any need to change the api. And this would get rid of all the need for "child windows" and "modal windows" and "stay on top" windows and the dozens to hundreds of "window modes" all of which are variations on "try to stop other windows from raising atop this one".
The LA museum of Discovery is not free, either.
Imagine if OS X did not reject invalid surrogate pairs for filenames. What would happen, then, if you'd have a file with such an invalid pair, and then did a readdir() on it - what would you expect to see in d_name after it converts to UTF-8?
I would expect to see the unpaired surrogate halves each converted to the correct 3-byte UTF-8 encoding for that code point.
This really isn't that hard. Yes 8-bit systems are a mess of legacy encodings, so my recommendation is that the problem be ignored as much as possible, moved as close to final interpretation on the display as you can. All other solutions just perpetuate these problems by making it impossible to change them to UTF-8. And as you are well aware, any text that is not UTF-8 is going to display wrong in a huge number of programs no matter how you set the locale.
16-bit systems do not have this encoding mess, as they are virtually 100% UTF-16 or UCS-2 which have identical and obvious and lossless conversions to UTF-8, therefore there is no reason for the filesystem to not do this.
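To make the "obvious" conversion concrete, here is a minimal sketch, not anybody's actual driver code. It assumes the filename arrives as a list of 16-bit code units; both function names are made up. Valid surrogate pairs become one 4-byte sequence, and an unpaired half just gets the generic 3-byte layout for its code point.

```python
def encode_code_point(cp):
    """Generic UTF-8 bit layout for any value up to 0x10FFFF, surrogates included."""
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


def utf16_units_to_utf8(units):
    """Convert 16-bit code units to UTF-8 without ever rejecting the input."""
    out = bytearray()
    i = 0
    while i < len(units):
        u = units[i]
        is_lead = 0xD800 <= u < 0xDC00
        has_trail = i + 1 < len(units) and 0xDC00 <= units[i + 1] < 0xE000
        if is_lead and has_trail:
            # Properly paired surrogates: combine into one code point.
            cp = 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            i += 2
        else:
            # BMP character or an unpaired surrogate half: encode as-is.
            cp = u
            i += 1
        out += encode_code_point(cp)
    return bytes(out)


print(utf16_units_to_utf8([0x41, 0xD800, 0x42]))
```

Every distinct sequence of 16-bit units gives a distinct byte string, which is all that is needed for every file to stay nameable.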
I think you are starting to understand.
Your example of Joliet on the legacy Russian locale: the correct result is that the 16-bit filenames on the disk are converted to UTF-8 by the filesystem driver. If the program that prints on the terminal is aware that the terminal is going to interpret these as 1-byte russian encoding, then it is that program's responsibility to convert the names again, although it is likely nobody is going to complain much if it does not do so and the terminal produces mojibake. 90% of the programs those filenames are going to will produce the correct output, because they only understand UTF-8. Browsers, for instance.
Now if the filesystem has 8-bit filenames stored in that Russian locale, the filesystem should return them unchanged. It can't do any better, it really should have no idea of any "locale" and I don't think it should attempt to convert to UTF-8. The reason it is safe to always convert 16-bit filenames is because currently 100% of them are in UTF-16 (or UCS-2 which only differs in unassigned Unicode points anyway and can be converted with the same code as UTF-16) and UTF-16 to UTF-8 conversion is lossless even for invalid UTF-16 so there is no problem with the "wrong locale" being used. That is not true of 8-bit encodings, they can be in any number of locales and the raw bytes are much more useful than the incorrect recoding. It is true that these strings will display "correctly" if the terminal happens to be using the same encoding, but this is just obsolete implementations that should be fixed. Due to the design of UTF-8, those Russian names are almost 100% likely to be invalid UTF-8, so a correct program could check for this, and if it looks like invalid UTF-8 (and ONLY then) it can use some legacy code to figure out what encoding it probably is and print it. Or, more likely, it can just print it as UTF-8 and produce the *same* mojibake that 90% of the software in the world will produce, which would be an awful lot more user friendly than the current behavior, and would quickly get everybody switched to UTF-8 and rid us of "locale" forever.
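The "check for invalid UTF-8, and ONLY then fall back" idea is only a few lines. This is a sketch under assumptions: the function name is made up, and the fallback encoding is a hardcoded guess (KOI8-R here) standing in for whatever legacy detection a real program would do.

```python
def display_name(raw, legacy_encoding="koi8_r"):
    """Decode a raw filename for display only; the raw bytes stay the real name."""
    try:
        # Valid UTF-8 is always treated as UTF-8, whatever the locale says.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Only invalid UTF-8 gets the legacy guess (hypothetical default).
        return raw.decode(legacy_encoding, errors="replace")


russian = "\u043f\u0440\u0438\u0432\u0435\u0442"  # a Russian word, 6 letters
print(display_name(russian.encode("utf-8")))
print(display_name(russian.encode("koi8_r")))
```

The KOI8-R bytes for that word start with 0xD0 0xD2, and 0xD2 is not a continuation byte, so the strict decode fails right away and the fallback is taken.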
On Windows what I use for all filesystem access (in fltk) is a converter from UTF-8 to UTF-16 that converts all error bytes as though they are in the CP1252 encoding. This is lossy but not that bad as I already have to deal with "case independence" so conversion to a filename is lossy anyway. The reverse conversion from UTF-16 to UTF-8 is lossless (unpaired surrogates turn into the obvious UTF-8 encoding) so any result of readdir will work, have all the filenames different, and those filenames can be used to open the files. The problem is that these functions are not "fopen()" and so on, but of course have different names, as they are not part of libc. This greatly limits usability of them as I cannot deal with any libraries that open files on their own.
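A rough sketch of that kind of converter, written in Python rather than copied from fltk, so treat the details as assumptions: the handler name is made up, and the handful of bytes that CP1252 leaves undefined are passed through as Latin-1 here, which may not match what any real implementation does.

```python
import codecs


def _cp1252_char(byte):
    """Map one error byte to the character CP1252 assigns to it."""
    try:
        return bytes([byte]).decode("cp1252")
    except UnicodeDecodeError:
        # Assumption: bytes undefined in CP1252 fall back to Latin-1.
        return chr(byte)


def _error_bytes_as_cp1252(err):
    """Decode error handler: reinterpret the offending bytes as CP1252."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    return "".join(_cp1252_char(b) for b in bad), err.end


# "cp1252_errors" is a hypothetical handler name registered for this sketch.
codecs.register_error("cp1252_errors", _error_bytes_as_cp1252)


def utf8_to_utf16(raw):
    """Valid UTF-8 converts normally; error bytes are read as CP1252."""
    return raw.decode("utf-8", errors="cp1252_errors").encode("utf-16-le")


print(utf8_to_utf16(b"caf\xe9.txt"))
```

The lossy part is visible immediately: the Latin-1 style bytes `caf\xe9.txt` and the proper UTF-8 spelling of the same name both end up as the same UTF-16 string.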
For Windows to do POSIX correctly it must implement this at a low level so all the POSIX calls and everything that calls them goes through the same code. I think the easiest way is to change the "mb" (and probably the "a") apis to be forced to use this conversion, as this useful feature would be available to Windows software too. Changing just the POSIX api to do this means the POSIX programs could not call Windows libraries that don't use this when opening files which would make portability much harder. It would also help considerably for access to remote filesystems that use 8-bit filenames if there was a "tunnel" that sent the 8-bit text unchanged, this is because the conversion UTF-8 to UTF-16 and back is lossy. I can understand Microsoft balking at that idea, and it probably is livable without it.
You are wrong. You can do Unicode by making all the api's be UTF-8. There is no need for UTF-16 (which is just another variable length encoding, have you heard of non-BMP characters?).
Absolutely the correct thing for a Linux NTFS driver to do is to convert the filenames to UTF-8. Yes if they echo on a terminal that is translating byte streams to glyphs using the rules of KOI8-R or whatever, they will display as mojibake. That does not mean the filenames are bad, in fact if the user carefully types the exact same mojibake in they will name the file, proving that it is in fact the name of the file.
What does break is lossy conversion. You CANNOT translate NTFS filenames to any non-Unicode encoding, because you will lose characters that were in them and therefore it is impossible to name all the possible files!
Another lossy conversion is to attempt to convert UTF-8 to UTF-16. You cannot do this (at least not while preserving the conversions done by most programs from UTF-16 to UTF-8). Therefore you certainly do NOT want to make a "wchar" api in POSIX, it would be VERY VERY VERY BAD!!!! OS/X tries to fix this by rejecting invalid byte encodings for filenames so there cannot be a file that has a name that cannot pass through their 16-bit api, but I consider this a rather stupid movement of code that should be very high level down into the driver. As there are lots of filesystems that allow an arbitrary byte stream to be a filename, you cannot use any "wchar" api for these.
Actually you can see the problem in Windows when it attempts to show remote Unix filesystems using NFS. Instead of displaying the filenames as UTF-8, it displays them as ISO-8859-1. This is the typical broken response that I have seen programmers resort to when they are finally hit by the clue-by-4 and realize that there is no magical part of the computer that forces bytes to always be valid UTF-8. Microsoft is not doing too bad, I certainly have seen ones that discard *all* high bit bytes. Yuck.
Does OS X really do that? It seems to be using UTF-8 locale by default; are you sure it's not just that, and if you change it, it won't change filename treatment accordingly?
Yes the locale for libc is set to UTF-8 on OSX. It is on Linux as well.
If so, then it's broken in a different way, namely: you can't take an output of any locale-aware ANSI C or POSIX function, and pass it to fopen() and friends expecting meaningful results.
You seem to have a weird idea of what is a "meaningful result". For me I expect that the exact same array of bytes will open the exact same file every single time!
Anyway I have OSX here to check, it appears to work the way I expect. I made filenames with 2, 3, and 4 byte UTF-8 characters in them (mostly by cutting and pasting from web pages into the terminal). They list correctly, but when I set the locale to "C" they then display 2, 3, or 4 '?' characters (not the single one you would get if there was some kind of locale-aware conversion from UTF-16). I wasted some time trying to get them to display as ISO-8859-1 bytes, but it appears Terminal only understands the UTF-8 locale; in all others it prints all high-bit bytes as '?'. Feeding the output of ls to od also shows that the exact same bytes occur no matter what the locale is. I did some experiments in tcl as well, all showing that the filenames report the same number of bytes, the number expected for UTF-8, no matter what the locale is. So far I have been pretty happy with OSX, they seem to have their brains together and don't assign magical properties to strings, considerably better than the morons writing a lot of Linux and Windows.
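Anybody who wants to repeat the experiment without ls and od can try something like the following. It is only a sketch: the helper names are made up, it uses a throwaway temp directory, and it switches just LC_CTYPE to "C" to stand in for changing the locale. The test characters are picked to have 2, 3, and 4 byte UTF-8 forms.

```python
import locale
import os
import tempfile


def raw_names(directory):
    """Directory entries as raw bytes, with no decoding step in between."""
    return sorted(os.listdir(os.fsencode(directory)))


def demo():
    """Create one file, list it under two locales, and return what was seen."""
    with tempfile.TemporaryDirectory() as tmp:
        # 2-byte, 3-byte, and 4-byte UTF-8 characters plus an ASCII suffix.
        name = "\u00df\u20ac\U0001F600.txt".encode("utf-8")
        open(os.path.join(os.fsencode(tmp), name), "wb").close()

        before = raw_names(tmp)
        saved = locale.setlocale(locale.LC_CTYPE)  # query current setting
        try:
            locale.setlocale(locale.LC_CTYPE, "C")
            after = raw_names(tmp)
        finally:
            locale.setlocale(locale.LC_CTYPE, saved)
        return name, before, after


name, before, after = demo()
print(len(name), before == after)
```

On a system that behaves the way I describe, the byte count matches the UTF-8 length and both listings are identical.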
Feeding an invalid UTF-8 string to the api should result in exactly the same sort of result as feeding an invalid UTF-16 string to the Windows wchar api (a UTF-16 string can easily be invalid by having an unmatched surrogate half).
I am reasonably certain that Windows will go and create the file despite the invalid UTF-16. In fact they never test for this, because they are well aware that it would be a total pain in the ass for anybody programming to have to make sure an array contains some magical pattern before it can be used as a filename.
However these very same people will often then think that UTF-8 must be treated special. They will literally say an array is "not UTF-8" because it has invalid sequences in it (you did above, for instance). This is like saying a newspaper is not in English if there is a misspelled word in it. Technically true but pretty useless, and it is far better to treat it as English than to define some new language for it. An easy way to see if they are either hypocrites or (more likely) idiot savants is to see what the same programmers do about invalid UTF-16. They will often act completely differently, proving their stance is totally illogical.
Due to the unfortunate use of 16-bit strings in a lot of apis, the best solution so far has been to preserve UTF-8 as long as possible, then do the lossy conversion by taking each error byte (which will always have the high bit set) and encoding it as a low (trailing) surrogate half in the range 0xDC80-0xDCFF. On its own this is invalid UTF-16, so on the off chance the system actually produces an "invalid encoding" error for UTF-16, they will get one for invalid UTF-8 as well. I think Windows does not produce such an error, so it will instead create a file.
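For what it is worth, Python ships this exact mapping as its "surrogateescape" error handler, so the scheme can be tried out in a few lines. This is just an illustration using Python's codecs as a stand-in, not the code from any particular toolkit, and the function name is made up.

```python
import struct


def utf8_to_utf16_units(raw):
    """Convert bytes to 16-bit code units, error bytes becoming 0xDC80-0xDCFF."""
    # surrogateescape maps each undecodable byte 0xNN to U+DCNN.
    text = raw.decode("utf-8", errors="surrogateescape")
    # surrogatepass lets the lone surrogates through into the UTF-16 output.
    data = text.encode("utf-16-le", errors="surrogatepass")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


units = utf8_to_utf16_units(b"a\xffb")
print([hex(u) for u in units])
```

The error byte 0xFF shows up as the single code unit 0xDCFF, which no valid UTF-16 string contains on its own, so a system that validates its 16-bit input gets the chance to reject it.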
This whole thing is far from perfect but we have to live with a bunch of politically-correct idiots who thought wchar was the solution.
Then the file system driver translates the 16 bit filenames to an 8 bit encoding. At no point does the kernel or runtime api see anything other than 8-bit encodings.
I strongly suspect that there are equally idiotic people writing these drivers as it sounds like there are at Microsoft, and the translation may depend on the "locale" rather than being forced to UTF-8. In that case I suspect Linux is going to be nearly useless reading these disks until they fix that, for the same reason this SFU stuff is going to be useless on Windows.
This is not a limitation of the compiler. You cannot set UTF-8 with GCC either. The actual system API refuses.
The LC_ALL and similar environment variables should not affect how filenames are stored on disk. That would be incompatible with POSIX. Although I think it would be best to ignore them entirely, if you wanted to make the POSIX subsystem obey them they must limit themselves to changing printf and scanf functions.
If POSIX "open()" maps to the "mb" api in Windows then the Windows "locale" MUST be set to UTF-8 with a mode such that errors turn unambiguously into UTF-16 errors without throwing exceptions (it is not necessary that these files actually open, only that they return the same type of error that other invalid characters do). Any other situation makes this POSIX api useless.
As on Unix,
Wrong. Unix returns the 8-bit data UNCHANGED from the disk. The locale does not alter what filenames are seen.
for the likes of readdir and other multibyte (rather than widechar) functions, it uses the current locale to encode filenames.
Okay then the answer is NO. It is not possible to name every file. Therefore this SFU is useless.
If your system locale is set to UTF-8, I believe it will feed you UTF-8 (mine is set to CP1251, and for some mysterious reason it requires a reboot to change, so I can't check this for sure - sorry).
You have now identified the obvious solution. However from everything I have tried, it appears to be impossible for a program to force the API to UTF-8. You have to set a registry or something and start a new login. The fact that this is impossible leads me to believe that Microsoft is still actively trying to prevent compatibility. It is also likely that "UTF-8" locale will throw exceptions, crash, or do other strange stuff when the UTF-8 byte stream contains errors, this makes it extremely unreliable.
OSX uses an 8-bit api to pass filenames to the kernel and thus to the filesystem drivers.
It is quite possible the filesystem does UTF-8 to UTF-16 conversion when looking up files. That is irrelevant, except for the fact that this same feature is needed for Windows to correctly support POSIX.
It does appear Windows is going to continue to be broken, as you say it uses the "current locale" to translate the filenames from POSIX into the ones on the disk. This is unlike OSX and also totally incompatible with any 8-bit filesystems and with Unix. Unix relies on a given string identifying the same file no matter what the "locale" is.
This is trivial to fix by forcing the "locale" to be UTF-8. The fact that Microsoft does not do this, and seems to distinctly avoid providing an interface so a program can decide to do so (I don't think it requires a reboot as you say, but it does need a new shell created), indicates to me that they are still, on purpose, making sure portability is impossible.
I think the general complaint is that the "ribbon" really isn't new. There have been "tabs" for a long, long time, including in Microsoft software. And some toolkits (not sure about MFC) consider the "menubar" to be a normal widget and therefore can be put inside a "tab". Furthermore many toolkits have considered a "menubar" to not only contain "submenus" but also "buttons" (often as a submenu title with no children). I certainly did this in a toolkit that is now almost 20 years old (see fltk).
Like a lot of Microsoft stuff, a lot of the angst is that Microsoft pretends they have come up with something *new* when in fact it can be a decades old idea. Also that they absolutely refuse to use any existing term for that idea, in order to defeat searches for prior art (ie the term "ribbon" rather than "tabs" which is certainly the keyword all previous versions used).
Conversely I think the anti-Microsoft crowd is to blame. They sit on their duff, scared to death of being "hard to use" or the dreaded "inconsistent" and thus refusing to actually incorporate any of these new ideas into actual end user products. So nobody actually sees these except for the programmers at Microsoft who examine the code and decide what to steal. And then when Microsoft comes up with it, they try to project their own fears onto them and try to say "oh Microsoft is being inconsistent and hard to use!!!"
You know there's an UTF-8 multibyte locale in Windows, right?
Yes I know that. You know it is IMPOSSIBLE for a program to force this to turn on? Maybe want to explain that little detail to us?