I would guess that the problem with BSD/Solaris (and Irix, the one we had trouble with) is that the extension for shared libraries is .so there as well, so you can't put builds for more than one of those platforms in the same directory.
To allow multiple-platform plugins to work we initially named the .so with a different extension for each platform. This did not go over too well, as anybody knowledgeable enough to want to develop plugins was immediately stuck because it was not obvious what the files were or how you write them. When we dropped Irix support we renamed them back to .so/.dll/.dylib, and suddenly we started getting support questions about how to write them, so it seems apparent that the extension was a real stumbling block.
Who said people like autotools? You don't seem to be too aware of Linux programming if you think people are in favor of them...
I have NEVER seen autotools used for development. Only if somebody wants to make source that Unix users can download and compile does that stuff ever get written, and it is considered a final and very annoying step.
I think a working and installed-by-default Unix compatibility layer in Windows would be the very first thing added. That would attract a lot of open source developers' interest.
(Not that any of this is ever going to happen. Microsoft is never going to publish the code in a form that anybody will be able to modify or be interested in modifying. And they are unlikely to ever publish the source at all, even if they wanted to, because it would take a huge effort to locate and replace all the code acquired from 3rd parties that they have no right to redistribute.)
It works if the number of people is small enough that they can't easily find each other and instead have to find you and buy from you. I'm not sure if this really happens with any sample greater than 1. But GPL software certainly is sold at that sort of quantity: every IT person who has set up a Linux system and been paid for it has in fact sold GPL software. Yeah, the company can copy the whole setup and give it to another company, but they don't do that.
I think from a "I want to install this program" point of view you are looking at it wrong. Linux certainly has some problems, but some of your other items are much better: in particular the location of config files (mostly because you don't want to be different than the other programs, although there certainly are some programs that want to read other configurations), the endless annoyance of "how to make my program show up on the start menu" (wtf happened to "put a file here and it's on the menu", the Gnome/freedesktop stuff that requires you to exec a program is STUPID!!!). And the completely unsolved problem of "how do I make a new file association" (at least for me, I have not a clue!).
I'm not sure how many programs want to change the default printer, which is why I think this is a poor example.
I think you misunderstand. By "change resolution" I meant "a program changes the resolution to something it wants, takes over the screen, and changes the resolution back on exit". Not the fact that screens have different maximum resolutions.
It was 100% done because games were too slow unless the number of pixels they needed to draw was kept small, and that was the only way to get a large display (plus putting the game in a Windows window removed almost all the hardware acceleration).
This program-controlled changing of resolution has been obsolete from the moment LCD screens became popular, so it is probably just as well that Linux never did it.
Responding to the screen changing (such as rotation) is something that should be supported however. And the user should be able to fix the resolution (for ALL programs) using some control panel.
That's probably the new RandR extension to X. However if that exists it is the same on all platforms, it would be an X call and you would not interact with whatever is drawing that dialog. I personally have never seen it work (probably because I am using the Nvidia drivers) and cannot get the screen resolution to change without Ctrl+Alt+Backspace.
Most of your list is differences between the distros, but screen resolution is not one of them. Determining screen resolution is exactly the same on all Linux distributions that use X. Changing the screen resolution is exactly the same (ie impossible) on all distributions as well. The fact that it is impossible may have confused you.
That is just stupid.
How about: they go to the website, and are presented with ONE distro.
Hmm that seems to be what you get from Ubuntu.
The fact that any joe blow can burn a CD and make up a catchy name and claim they made a "distro" does not mean that there really are hundreds of distros. I can go to the store and get a few thousand versions of Windows if you count that way (all kinds of options as to whether Firefox is included, etc).
Fuck yea I would prefer to give command line functions over the phone than try to direct somebody through a GUI over the phone! Have you ever done either of these?
Actually there is an xlib call to return the size of the desktop. Quicken could then pop up an error if it is too small. This is quite portable to every distro.
Attempting to change the desktop resolution is stupid and has been for years now, ever since LCD screens became prominent. That was only a feature of Windows because when it was introduced, lowering (not raising) the resolution was the only way to get games to run fast enough.
So either everyone learns what "apt-get" does (not to mention how to use a command line interface in the first place)
You may not have realized it, but there was an amazing development in computer science called the exec() call. This has the INCREDIBLE ability to allow a GUI to run a function even though that function is described by a CLI line. I know this is hard to believe, apparently for many years it was believed impossible, so that people who are so smart that they can type and push Submit buttons still think it cannot happen. But it has been done! Even Linux does it!
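For anybody who wants it spelled out, here is a minimal sketch in Python of a GUI handler that does its work by running a command line. The names run_cli and install_command are made up for illustration, and the apt-get line is only built here, not executed.

```python
import shlex
import subprocess
import sys

def run_cli(command_line):
    """Run a command described by a CLI line; return (exit_code, stdout)."""
    argv = shlex.split(command_line)  # split the line the way a POSIX shell would
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.returncode, result.stdout

def install_command(package):
    """Hypothetical helper: the CLI line an 'Install' button would hand to run_cli."""
    return "apt-get install -y " + shlex.quote(package)

if __name__ == "__main__":
    # Stand-in for a button callback: run a harmless command and show its output.
    code, out = run_cli(shlex.quote(sys.executable) + " -c 'print(6 * 7)'")
    print(code, out.strip())
```

The user clicks a button and never sees the command line, but the person on the support phone can still read the exact same line out loud if the GUI is broken.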
If you count distributions by how many different people made one, then you had better start counting Windows installations by what the installer copied. Some have Firefox, some don't. Some have Word, some don't. Some have adware from the computer manufacturer. There are THOUSANDS of versions of Windows if you count it that way.
The idea that a beginner is going to be confused by hundreds of Linux distros is absurd. The biggest hurdle is for the beginner to hear about even ONE distro. And that one will be either: 1: whatever Linux is installed on the machine they bought. 2: Ubuntu (with no letters added to the start). 3: RedHat if they work in a business environment.
Maybe, just maybe, you could throw 3 or 4 more in there that somebody will ACTUALLY see.
But this "hundreds of distros" is just bullshit and you know it.
The LSB has a mess of Unix history in it, I think.
There are far too many directories for reasons that have been obsolete for 30 years (basically they have been obsolete since the symbolic link was invented).
If you have the name of a program and you want to find its global configuration file, you should be able to look at precisely ONE name that can be stored in a string constant. If there is some compelling reason why it should be on a different disk, then the system manager can make a symbolic link from that first location to point to where they want the file. DONE. No environment variables and no search paths!
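A rough sketch of what I mean, in Python. The ".conf" suffix and the function names are my own assumptions for the example, not any standard. The point is that the program only knows one fixed name, and a symbolic link handles relocation without the program noticing.

```python
import os

CONFIG_ROOT = "/etc"  # the single agreed-upon location (assumption for this sketch)

def config_path(program, root=CONFIG_ROOT):
    """The ONE name to look at; no environment variables, no search path."""
    return os.path.join(root, program + ".conf")  # ".conf" suffix is hypothetical

def read_config(program, root=CONFIG_ROOT):
    """Read the config file; open() follows a symlink placed at the fixed name."""
    with open(config_path(program, root)) as f:
        return f.read()
```

If the system manager wants the file on another disk, they run something like ln -s on the fixed name and read_config keeps working unchanged.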
One problem is that most people who propose this also propose changing *all* the names, such as changing /etc to /Configurations or something. I think the real solution is to pick one of the LSB names and put everything in there, ie /etc for config files, /bin for programs you run from the shell, etc.
Also package directories for programs would be nice. This would require a small modification of shells and exec so that running Foo would run Foo/Foo if Foo is a directory (and possibly set LD_LIBRARY_PATH so the directory is first). Please don't copy the nonsense of the many layers that OS/X uses for its "Bundles".
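Something like the following is all the lookup rule would need. This is a Python sketch of the idea only; a real version would live in the shell or in exec itself, and resolve_program and launch_env are names I made up.

```python
import os

def resolve_program(path):
    """If path is a directory Foo, the thing to execute is Foo/Foo."""
    if os.path.isdir(path):
        name = os.path.basename(os.path.normpath(path))
        return os.path.join(path, name)
    return path

def launch_env(path, env):
    """Copy env, putting the package directory first in LD_LIBRARY_PATH."""
    new_env = dict(env)
    if os.path.isdir(path):
        pkg = os.path.abspath(path)
        old = new_env.get("LD_LIBRARY_PATH")
        new_env["LD_LIBRARY_PATH"] = pkg if not old else pkg + os.pathsep + old
    return new_env
```

One level of directory, one rule, and the program's private libraries sit next to it.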
If you took all the people who install and customize Windows and put their own settings on it, there are THOUSANDS of "distributions" of Windows, too.
Just because somebody can make a disk and make up a clever name does not mean they have made a distro. Somebody other than that person's friends has to use it.
I think if you count it that way the number goes down to 20 or 30. And if you ignore stuff that is clearly not designed for the desktop, it goes down to 6.
Depends on what the community's goal is. If it is to increase market share and just generally make software better for everyone by decreasing Microsoft's dominance, then it has to be more than just a hobby.
You are missing the actual explanation, neither of yours is right:
It is indeed a hobby, with a lot of very committed people practicing it and enjoying it and wanting to continue doing so. But just the ability to continue doing the hobby means that you must decrease Microsoft's dominance, or it will quickly become impossible. These people want to know that they could completely understand and control how their computer works, but they also want that computer to function in the modern world, be built with modern components, and use modern techniques. All this is illegal or impossible if Microsoft wins.
The driving force behind FOSS is just people who want their personal and understandable computer to work. If that means they have to convince 20% or so of the non-hobbyist population to use their stuff, then they are going to work VERY HARD to make that happen. Nobody enjoys reverse-engineering Microsoft's spew or living with nasty and ugly technical compromises in order to talk to Microsoft stuff. The hobby would be far better without that. But it has to be done for the hobby to survive.
One reason I think the push behind Linux dropped in the last few years was the emergence of OS/X. I personally feel that if there were two equal competitors fighting over the computer market, but forced to be open and interoperate with each other, such that there was no problem for a hobbyist to make their own compatible system, there would be nearly no push behind Linux, and in fact it would have remained about as popular as Minix.
U.S. Patent No. 6,411,941
Any opinions on whether this is bogus or obvious?
IMHO he did a stupid thing by taking that job. He had to sign employment agreements and contracts and they most likely invalidated his claims. And he certainly copied documents he was not allowed to copy as an employee.
I don't understand, what exactly would I do before the final presentation step?
Everything I can think of does not require thinking about "characters". Occasionally the Unicode code points become important (mostly they are needed to translate between encodings), but I am unable to come up with an example that does not also iterate over the string. Iteration makes it trivial to handle variable-length encodings.
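To show how little work the iteration is, here is a sketch in Python that walks UTF-8 bytes and yields each code point with its byte offset. It is deliberately simplified: it checks lead and continuation bytes but does not reject every ill-formed sequence (overlong three-byte forms and encoded surrogates get through), so treat it as an illustration rather than a validating decoder.

```python
def iter_code_points(data):
    """Yield (byte_offset, code_point) for each code point in UTF-8 bytes."""
    i = 0
    n = len(data)
    while i < n:
        lead = data[i]
        if lead < 0x80:               # ASCII, one byte
            length, cp = 1, lead
        elif 0xC2 <= lead <= 0xDF:    # two-byte sequence
            length, cp = 2, lead & 0x1F
        elif 0xE0 <= lead <= 0xEF:    # three-byte sequence
            length, cp = 3, lead & 0x0F
        elif 0xF0 <= lead <= 0xF4:    # four-byte sequence
            length, cp = 4, lead & 0x07
        else:
            raise ValueError("invalid lead byte at offset %d" % i)
        if i + length > n:
            raise ValueError("truncated sequence at offset %d" % i)
        for b in data[i + 1:i + length]:
            if b & 0xC0 != 0x80:
                raise ValueError("invalid continuation byte after offset %d" % i)
            cp = (cp << 6) | (b & 0x3F)
        yield i, cp
        i += length
```

The caller gets offsets it can store and reuse, and never needs to know "which character number" anything is.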
I went to read the page. Unfortunately the Python guys are making a few mistakes.
A BOM is *not* wanted on UTF-8: it destroys the important property that UTF-8 is compatible with ASCII. The fact that Windows relies on this leads to the "bush hid the facts" bug where it thinks plain ASCII is UTF16LE. They should not be supporting this idiot idea. UTF-8 can be very reliably identified by checking for invalid UTF-8 sequences. In real text it is virtually impossible for another encoding to pass that check, because the necessary byte sequences in ISO-8859-1 (or any other byte encoding) are not meaningful text, and even random bytes have a very tiny (approx 1/20^N where N is the number of bytes) chance of matching.
Also if they do not throw an exception when a UTF-16 filename is read on Windows and it contains non-BMP characters, they are using UTF-16. No amount of calling it UCS-2 will make it UCS-2. This would be the same as saying "I'm reading this UTF-8 data, but I will call it 'ASCII' and therefore I never have to worry about multibyte characters, they will magically disappear because I said it is ASCII".
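The detection needs no BOM at all. A minimal sketch in Python, where the function names and the ISO-8859-1 fallback are my choices for the example:

```python
def looks_like_utf8(data):
    """True if the bytes are valid UTF-8 (plain ASCII counts as valid)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

def decode_guess(data, fallback="iso-8859-1"):
    """Decode as UTF-8 when valid, otherwise fall back to a byte encoding."""
    if looks_like_utf8(data):
        return data.decode("utf-8")
    return data.decode(fallback)
```

ISO-8859-1 "café" is the bytes 63 61 66 E9, and a lone E9 is an incomplete three-byte lead in UTF-8, so the check fails and the fallback is used.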
I can't test this, but I am pretty certain they are taking Windows UTF-16 and putting each word unchanged into their "UCS-2" strings. Therefore they are using UTF-16 but they have broken translators to UTF-8.
The cutoff is 0x10ffff, which is 1114111 decimal. They seem to have this right, though I never saw it shown in decimal before. You added a 0x to the start, which is wrong.
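The arithmetic is easy to check. A tiny Python sketch (the names are mine):

```python
MAX_CODE_POINT = 0x10FFFF  # last valid Unicode code point

def is_code_point(n):
    """True if n lies in the Unicode code space 0..0x10FFFF."""
    return 0 <= n <= MAX_CODE_POINT

def parse_decimal_cutoff(text):
    """Parse the cutoff when it is written in decimal, as on that page."""
    return int(text, 10)
```

parse_decimal_cutoff("1114111") equals MAX_CODE_POINT, while 0x1114111 (the decimal digits with a 0x stuck on the front) is a much larger number and is not a code point at all.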
s[3] would return the 3rd byte. Anybody who thinks otherwise has obviously not done any serious work with Unicode storage.
If you really want, you could make a method s.getUnicodeCodePoint(n) which would return the n'th code point. You could also make a function that returns the n'th character as users see them (ie it handles combining and invisible characters), or the n'th word, the n'th piece where the ink does not touch, the n'th syllable, etc. In fact there are a million interesting ways of breaking up a Unicode string, and most of them are complicated. And also none of them are used except in the final presentation step, including your "character" example.
The belief that "characters" are important is due to the fact that in ASCII an offset into a string was equal to the "number of characters". This led to lots of UI being documented as taking "characters" when in fact the purpose of the function is to produce or consume an offset.
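Here is a sketch in Python of three of those different answers to "the n'th thing" for the same stored UTF-8 data. The function names are invented, and the user-visible grouping is a crude approximation (base character plus any following combining marks), nowhere near the full Unicode segmentation rules.

```python
import unicodedata

def nth_byte(data, n):
    """What indexing the storage gives you: constant time."""
    return data[n]

def nth_code_point(data, n):
    """The n'th code point: requires walking the string."""
    return ord(data.decode("utf-8")[n])

def visible_characters(text):
    """Crude grouping: attach combining marks to the preceding base character."""
    groups = []
    for ch in text:
        if groups and unicodedata.combining(ch):
            groups[-1] += ch
        else:
            groups.append(ch)
    return groups
```

For "e" followed by U+0301 (combining acute) followed by "x", index 1 is the byte 0xCC, or the code point U+0301, or the visible character "x", depending on which function you ask.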
Amateur programmers who think that s[3] should return something other than what is at 3*constant_size into the array are probably the biggest impediment to I18N, as they raise this silly argument every time UTF-8 or UTF-16 is suggested and it is really really hard to make them see the light. Very sad really. And scary, as most of them are just bright enough to be dangerous as they could probably write the insanely slow and useless functions they think are needed.
HP's popups are also on Macintosh. I have not figured out how to log in and not have it pop up a "configure your networked printers" dialog. Oh well, I learned you can cancel it and keep going (and the HP printer+scanner works fine!).
Does Flash do that?
The proposed solution is to translate all the encodings *TO* Unicode, then compare them. Not the other way around.
Unicode was purposely designed to have a symbol for every symbol in any other encoding used at the time, so that this is possible.
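In Python terms the comparison is a one-liner. A sketch, with same_text being a name I made up:

```python
def same_text(a, a_encoding, b, b_encoding):
    """Compare two byte strings by translating both TO Unicode first."""
    return a.decode(a_encoding) == b.decode(b_encoding)
```

The bytes for "café" differ in ISO-8859-1 (63 61 66 E9), UTF-8 (63 61 66 C3 A9) and the old DOS code page 437 (63 61 66 82), but all three decode to the same Unicode string, so they compare equal.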
Python uses UTF-16 internally on Windows and UTF-32 on Unix (I think "UCS-4" implies that values greater than 0x10ffff are allowed, but Python's converters to UTF-16 and UTF-8 do not handle this, so it is better to say it supports UTF-32).
Because Python accepts UTF-16 returned by Windows unchanged (ie when you list the files in a directory, or read data from a UTF-16 file), it is using UTF-16. No amount of calling it UCS-2 will change that. In addition the converter converts non-BMP characters from UTF-8 to the correct len=2 sequence of UTF-16, although the reverse converter is broken and will return 6 bytes.
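For reference, the len=2 sequence is plain arithmetic. This Python sketch computes the surrogate pair for a non-BMP code point, and also shows one way to get a 6-byte result: encoding each surrogate separately as a 3-byte sequence. That last part is my guess at what the broken converter does, reproduced here with the current Python 3 "surrogatepass" error handler, not a claim about the old code.

```python
def to_surrogate_pair(cp):
    """Split a non-BMP code point into its (high, low) UTF-16 surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a non-BMP code point")
    v = cp - 0x10000
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)

def correct_utf8(cp):
    """Proper UTF-8: one 4-byte sequence for a non-BMP code point."""
    return chr(cp).encode("utf-8")

def six_byte_form(cp):
    """Each surrogate encoded on its own as 3 bytes (6 total, not valid UTF-8)."""
    high, low = to_surrogate_pair(cp)
    return (chr(high) + chr(low)).encode("utf-8", "surrogatepass")
```

For U+1F600 the pair is D83D DE00, the correct UTF-8 is 4 bytes, and the per-surrogate form is 6 bytes.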
I did not know Python3 implemented the suggestion to do U+D8xx. I only saw people requesting this because otherwise handling UTF-8 data is nearly impossible. If Python really did do this it will help enormously. However (after looking at this a great deal) I think they should *not* do the reverse translation (ie turn U+D8xx back into bytes). The reason is that it will destroy the current nice fact that UTF-16->UTF-8->UTF-16 is lossless (imagine the UTF-16 had U+D8xx characters arranged such that the result is a legal encoding). And it does not make the inverse UTF8->16->8 lossless (the UTF-8 could have the 2-byte encoding of 0xD8xx in it).
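For anybody who wants to try it: what current Python 3 ships is the "surrogateescape" error handler, which maps each undecodable byte 0x80..0xFF to a lone surrogate in the range U+DC80..U+DCFF. I am assuming that is the mechanism being referred to here; it may not match the U+D8xx proposal exactly. A minimal sketch, with my own function names:

```python
def decode_keeping_bad_bytes(data):
    """Decode UTF-8, turning each invalid byte into a lone surrogate."""
    return data.decode("utf-8", "surrogateescape")

def encode_restoring_bad_bytes(text):
    """The reverse translation: lone surrogates become the original bytes."""
    return text.encode("utf-8", "surrogateescape")

def encode_strict(text):
    """Plain UTF-8 encoding, which refuses lone surrogates."""
    return text.encode("utf-8")
```

The byte FF is never valid in UTF-8, so b"ab\xff" decodes to "ab" plus U+DCFF, and only the escaping encoder will turn that back into the original bytes.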
The fact that Unix uses 4 bytes and Windows 2 has been a total source of pain for us. Many end users replace their Linux Python with one recompiled for 2 bytes, because (despite naive beliefs to the contrary) the difference causes code to be non-portable even if it appears to not be doing anything remotely complex with strings. This then causes us grief because we are not necessarily binary-compatible with the installed Python on Linux, forcing us to statically link our own copy. This then leads to complaints when people can't load their Python plugins into our code. I believe that if they had used UTF-8 from the start none of this crap would be happening.
Of course nobody calls Linux users 'fanbois', right?
Pot/Kettle and all that...