Unicode in itself is an attempt to make completely artificial, huge charset mandatory for everything to support
What the heck does "completely artificial" mean? All charsets are artificial. It's only about twice the size of BIG5 and SJIS, which are your alternatives for Asian support.
including devices that can't even fit Unicode font into their memory.
What do you mean by "Unicode font"? No one expects most fonts to include more than a small subset of Unicode, and there's no reason why a Unicode font that contains just the ISO 8859-1 subset should be any larger than an ISO 8859-1 font.
There were some attempts to support multiple charsets in the same text
It's known as ISO 2022. It's been around forever, and no one's stopping you from using it. It's used for COMPOUND_TEXT in X and MULE in Emacs. Most people don't like it because it's a state-heavy system. No one killed it by backroom politics - it just didn't go over very well.
Most of Unicode-should-replace-everything support emanates from people who use ISO 8859-1 encoding, that happens to be exactly the same as first 255 characters of Unicode, so they don't have to modify anything in non-trivial manner, and can just cut their fonts to fit them everywhere.
Huh? recode l1..utf-8 is as difficult as recode koi8r..utf-8. As for fonts . . . welcome to the 21st century. PostScript fonts label characters by name, and TrueType fonts have always been Unicode IIRC, so the only fonts that need recoding are BDF fonts. There are nice tools to do that automatically.
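To make the recode point concrete, here is a minimal sketch in Python rather than recode itself. The helper name to_utf8 and the sample strings are made up for illustration; the only thing that differs between the two conversions is the name of the source charset.

```python
def to_utf8(raw: bytes, source_charset: str) -> bytes:
    """Transcode bytes from a legacy charset to UTF-8 (hypothetical helper)."""
    return raw.decode(source_charset).encode("utf-8")

# Sample inputs, invented for this sketch.
latin1_bytes = "café".encode("latin-1")   # ISO 8859-1 input
koi8r_bytes = "мир".encode("koi8_r")      # KOI8-R input

# Same call, same amount of work; only the charset name changes.
latin1_as_utf8 = to_utf8(latin1_bytes, "latin-1")
koi8r_as_utf8 = to_utf8(koi8r_bytes, "koi8_r")
```

Note that the Latin-1 bytes do change on the way to UTF-8 (the accented letter becomes two bytes), so ISO 8859-1 users are not getting a free ride either.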
From other posts:
Precisely because all work on standards, formats and libraries that would do it for the programmers is stopped to benefit "Unicoders" who taken over the standardization process.
Woohoo! All your base are belong to us!
Have you ever thought that you're being just a touch paranoid? Unicode fans are putting work into things to get Unicode to work. You're welcome to join the standards committees and put in your work. Or, alternatively, create the tools you're talking about and make them a de facto standard.
Unicode loses all distinctions between languages used in the text
So does every other character set in the world. ISO 8859-1 doesn't tell you what language the content is in; neither does SJIS. Frankly, I don't know where the information is coming from; I'm working on a multilingual webpage, and there's no way in heck you're going to get me to go back through a hundred-page document to put in language information. If HTML were ISO 2022 based, I'd use the ISO 8859-3 charset for the whole document.
making impossible to do any complex processing that must know the language.
In the 0.1% of cases where you're dealing with multilingual text and you need to do complex processing on it, you're going to need to use some document-specific language tags (XML tags, Unicode Plane 14 tags, whatever).
Most people aren't going to put in the tags anyway, and the computer can't tell whether you're typing French or English, so any system that requires tags is going to get a lot of mistagged documents.
However it's obvious that all Unicode-using code can't deal with stateful text stream (text + language as state) because the whole point of Unicode was to avoid any state,
Part of the goal was to minimize state, yes. But state does exist in Unicode - BIDI, for example, which is necessary to use Hebrew and Arabic. And of course any code that wanted to use language tags would support language tag state!
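As a rough illustration of the BIDI point, the sketch below uses Python's standard unicodedata module to look up the bidirectional class of a few characters. The helper name bidi_classes and the sample characters are my own choices, not anything from the Unicode standard's text. The embedding controls at the end are the stateful part: one opens a right-to-left run and the other closes it.

```python
import unicodedata

def bidi_classes(text):
    """Return the Unicode bidirectional class of each character (hypothetical helper)."""
    return [unicodedata.bidirectional(ch) for ch in text]

# Latin 'a', Hebrew alef (U+05D0), Arabic alef (U+0627).
letters = bidi_classes("a\u05d0\u0627")

# Right-to-left embedding (U+202B) and pop directional formatting (U+202C):
# a renderer has to remember the open embedding until it sees the pop.
controls = bidi_classes("\u202b\u202c")
```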
and XML processing programs have no requirement to preserve attributes i their internal processing,
There's an English saying: You can lead a horse to water, but you can't make him drink. If programs want to discard tagging information, they will and there's nothing anyone can do to stop them. If they want the information, then it's there.
But which subset should one support in any given situation?
Whatever Unicode subset you need? If you need to support Europe, you can look at MES-1, -2 and -3, successively larger subsets of European characters. If you need Japanese, I'd suggest the subset of Unicode corresponding to JIS X 0213. And so on.
How does this question differ from what charsets to support?
You might be a little easier to take seriously if you stopped the paranoid rantings about the evils of Unicoders and rationally discussed the problems with Unicode.
In other words, 1984. The government doesn't care about what uneducated people say, but the educated (Dr. Martin Luther King; Martin Luther; Mahatma Gandhi) can change the world, and hence are dangerous to the status quo. Giving the power to the government to watch just the group of people who are learning how to change the world is outrageous.
There's no such thing as absolute security, but there's a big difference between someone having the legal right to watch everything you write and someone breaking the law to do it.
What do you mean, my dorm room is not my personal residence? I live here, and pay for the privilege. Would you be so happy if your employer forced you to live in a certain apartment building, and forced you to use the LAN instead of a modem, and then told you that they could listen in on anything on that LAN?
| I will refrain from commenting on your last
| paragraph. But be careful with this line of
| reasoning. You are on the very edge of racism
| there.
That's one of the things I hate about modern science - it uses social criteria to judge hypotheses. Things should be judged true or not based on whether the evidence supports them, not whether they support the social goals of the day.
Why do you assume that Debian GNU/BSD would GPL the kernel? The Debian project does not advocate one free license over another (with the exception of patch clauses), and there aren't that many GPL-bigots in Debian. We also have no plans to include (much less GPL) a distro. As I discuss elsewhere on this page, if it happens, it will probably just be the kernel that Debian uses.
The Debian BSD project is a mailing list. The only way it's going anywhere is if someone decides it's worth a whole lot of dedicated effort, and puts that effort in, and they will probably end up dictating the general shape of the final result.
But Debian has a userland already, and except for a few low-level tools, it should all work fine with a BSD kernel, and that would minimize the problems with integrating it. A true BSD with apt is an interesting prospect, but a little out of Debian's aegis.
So, yes, one of the most likely shapes of Debian GNU/BSD would be to take the kernel from a BSD and replace the Linux kernel with it, minimizing the other changes.
It shouldn't be a matter of dumping the Hurd - if Debian dumped the Hurd, we'd lose the developers that develop the Hurd. But if you want Debian GNU/BSD, subscribe to debian-bsd@lists.debian.org and start working. The biggest problem with Debian GNU/BSD is a lack of interested workers, not anything political and certainly not anything having to do with the Hurd.
The g++ compiler does not use C as an intermediate language. It never has. I defy you to show me the switch that will produce C from your C++ code. Run /usr/lib/gcc-lib/i386-linux/2.95.3/cc1plus (with appropriate values substituted for i386-linux and 2.95.3, of course) by hand one day - you can feed it preprocessed source and watch it spit out assembly. Alternately, compile a pure C++ program one day and watch top. You'll see cc1plus, and gas/as in there, but you won't see cc1 (the C compiler, gcc only being a driver) in there, because it's not used.
Project Gutenberg has some non-public domain works, but they're a lot more discriminating with those than with the public domain texts. They don't really have the ability/resources to judge whether a book is worth archiving or not (see the questions in this discussion about reviewing the quality).
By Jim Tinsley on http://promo.net/pg/vol/wwwboard/messages/1580.html:
1. How do we know that a self-published work has merit? We're not literary critics - or biblical scholars. In the case of a century-old book into which a volunteer puts 30 or 40 hours of work, we accept that it has survived well enough to inspire a serious, egoless commitment from a modern reader, and that's good enough for us. But we also get offers of essays, stories and books from modern writers who may just be seeking something to boast about.
Nonsense. The problem here is binary compatibility, not source compatibility. UNIX has a billion different binary-incompatible implementations.
I've worked on a program that is supposed to run on many Unix systems. Is there a drem, and if not, is there a function that works like it? Linux has a working drem. A UNIX system may, or may not, and any replacement is not standardized. Looking at autoconf stuff for several programs, that's not unique to mdate and drem.
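For anyone wondering what "a function that works like it" would have to compute: glibc documents drem as the older BSD name for the IEEE 754-style remainder, where the quotient is rounded to the nearest integer (ties to even) instead of truncated. The sketch below shows that behavior in Python, using math.remainder (Python 3.7+) as the reference. The name drem_fallback is made up, and the fallback is deliberately naive: it rounds an already-rounded quotient, so it is not a substitute for a real libm routine on large or awkward inputs.

```python
import math

def drem_fallback(x: float, y: float) -> float:
    """Naive IEEE-style remainder: x - n*y with n = x/y rounded to nearest, ties to even.

    Hypothetical stand-in for illustration only, not a production replacement.
    """
    n = round(x / y)  # Python's round() on floats rounds ties to even
    return x - n * y

# The truncating remainder and the IEEE-style remainder disagree here:
truncated = math.fmod(5.0, 3.0)       # quotient truncated to 1
nearest = math.remainder(5.0, 3.0)    # quotient rounded to 2
fallback = drem_fallback(5.0, 3.0)
```

The difference between the two conventions is exactly why swapping in fmod on a system without drem is not a safe replacement.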
Where does Linux not fit the "open industry standards"? Linux has an ANSI C89 compiler and conforms closely to the POSIX.1 and POSIX.2 standards. GNU Libc also conforms to C99 and various Open Group standards.
(Considering that Motif and CDE only had one implementation and the people who are trying to reimplement Motif have found the documentation to be far from sufficient, I would question that they have much more claim to the title of Open System than GNOME/GTK and KDE/Qt.)
And how is any of this the GNU C Library's fault? Every single function you mentioned has its behavior dictated by an international standard that is based off Unix's C library, which had all the same behavior and problems. GNU libc must behave as it does to remain conformant with every other C implementation in the world.
The complaint that you're being miseducated can be (and has been) leveled against every fictional piece since the beginning of time. Do you complain about Star Wars because it was created in order to give you a view of divinity that you would consider false?
Porn is not didactic, as a general rule. It is not designed to educate. It is not designed to reflect reality. To judge it on those grounds is absurd.
Re:Nested tables is NOT the problem w/ NS (on "Mozilla .6 Released")
The rendering core was one of the first things totally rewritten from Netscape. So just because the bug is in Netscape doesn't mean it's in Mozilla, unless someone reimplemented the bug.
I've lived in those places. If you ask for a phone number, you get something like 4344, because everyone knows to add 327 on the front. 3 more digits may be an annoyance, but it's nothing new.
>> Ada was beautiful...
> Rubbish - the type system is flawed and the syntax is ugly.
Why is the type system flawed? Are you complaining about specific features, or the whole concept of a highly rigid non-inferring type system?
Syntax is in the eye of the beholder. Personally, I like block style (if ... then ... end if) better than C's (if (...) statement;). A true for loop that does everything a for loop should do without being the generic control structure is nice.
I looked at your link to Limbo, and I'm not impressed with Limbo's syntax. Too quiet. I prefer a language to loudly tell me what's going on, rather than putting a lot of meaning into punctuation.
Why is it great to compile everything if you are on something other than x86? From what I've heard, some of the slower boxes (m68k, for example) are real pains to compile large programs on. You also don't have any assurance that the code even compiles on your platform. With Debian compiling for Alpha, ARM, Sparc, M68K and PowerPC, there's little difference between i386 and any other platform.
As for having the source on hand: ~/Program_Source comes to over 500 MB on my system, and is usually one of the first places to look for needed space. I'm glad I don't have to carry anywhere near the source for every program on my system. I can apt-get source (or buy a CD) if I need the code, and delete it when I get done.
Why would someone who packaged KDE for Debian on his own for a year before Qt went GPL and it got into Debian proper want to sabotage KDE? For that is who packages KDE for Debian now - Ivan Moore (who ran kde.tydc.com).
There's no conspiracy. There's no one unified Debian opinion on KDE, any more than there's one unified Slashdot opinion on KDE. Yes, there are Debian developers who still harbor grudges towards KDE, but they aren't the ones packaging KDE for Debian.
"the high number-crunching performance only C or C++ can give" you? Whatever happened to Fortran? One is not bound to C/C++ in those situations - Fortran or Ada also perform quickly and are decent languages to write in. In fact, if you're doing some types of heavy number-crunching on a PC, Perl or Python work great - the speed of the interpreter is not an issue when 95% of the CPU time is spent running LAPACK routines written in highly-tuned Fortran.
Why? Pi is believed to be effectively random, so anything could be found if you use enough digits. It's like finding clues to the author of Shakespeare in his works - it could be that it was put there deliberately, but it's more likely the human mind finding patterns where there are none.
It's not a humanity. Humanities deal with human things - note that history of Greece is a humanity, but the history of the dinosaurs (paleontology) is a science - and math does not deal with human things.
But anthropology is considered a science and deals with humans. Same with sociology. I'm not sure whether I'd consider mathematics a science or not, but the dealing with humans aspect is irrelevant.
Dealing with humans is necessary for being a humanity. It could be that it's not sufficient for being a humanity, or that some subjects (like the two you mentioned) are both sciences and humanities, but that doesn't affect the fact that math is not a humanity.
Note I didn't say whether math is a science or not. The world isn't dissectible into sciences and humanities.
There are already JVMs and Java compilers under the GPL. And Microsoft has already paid the Kaffe people to add support for Microsoft extensions. This isn't going to change much of anything, except make a whole lot of Java stuff easier for Red Hat, Debian and others.
Lucky Starr, as in a series of stories by Paul French (aka Isaac Asimov). See http://homepage.mac.com/jenkins/Asimov/Series.html for summaries.
Why? GNU != GPL! Many GNU projects aren't under a GPL license, and Linux (the kernel) is not a GNU project.
But the cool thing about his approach is that he can calculate an arbitrary binary digit without finding the in-between numbers.
If you think about it, the fact that the first quadrillion binary digits were found implies huge jumps in storage space and computer speed.
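For the curious, here is a minimal sketch of how digit extraction of this kind can work. I'm assuming the approach in question is a Bailey-Borwein-Plouffe style formula, which yields hexadecimal digits (four binary digits each); the function names are mine, and plain floating point limits this toy version to fairly small positions.

```python
def _bbp_series(j: int, n: int) -> float:
    """Fractional part of the sum over k of 16**(n-k) / (8k+j) (hypothetical helper)."""
    s = 0.0
    # Terms with k <= n: modular exponentiation keeps only the fractional part.
    for k in range(n + 1):
        denom = 8 * k + j
        s = (s + pow(16, n - k, denom) / denom) % 1.0
    # Terms with k > n shrink by a factor of 16 each step; stop when negligible.
    k = n + 1
    while True:
        term = 16.0 ** (n - k) / (8 * k + j)
        if term < 1e-17:
            break
        s += term
        k += 1
    return s % 1.0

def pi_hex_digit(n: int) -> int:
    """Hex digit of pi at position n after the point (n = 0 is the first), BBP style."""
    x = (4 * _bbp_series(1, n) - 2 * _bbp_series(4, n)
         - _bbp_series(5, n) - _bbp_series(6, n)) % 1.0
    return int(x * 16)

# pi = 3.243F6A88... in hexadecimal
first_digits = "".join(format(pi_hex_digit(i), "x") for i in range(8))
```

The point is that pi_hex_digit(n) never stores or computes the digits before position n, which is why storage is not the bottleneck for this kind of record.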
True, but I understand that prior to about 1980, copyright law didn't cover software. So it could be in the public domain.