roozbeh · Slashdot Mirror

Re:The HIC talk campaign goes on and on.. on Iran: Even If Windows Is Free, Linux Is Preferred · 2004-09-22 05:45 · Score: 2, Informative

Well, it's called HCI, not HIC. It's called "Request for Proposals", not "Request for Comments".

Anyway, I was among the original supporters and architects of the HCI Persian Linux (FarsiLinux) effort, but it's now far from being under any kind of influence from me, and I don't approve of most of their actions. I even agree that they don't understand the whole notion properly yet. But it has good effects, especially when they provide funds to companies who loved to work on Linux but couldn't hire good developers. They also have the courage to recommend Linux to the government and the corporations, which helps the evangelization effort. Just look at their home page (top left). Which government organization in the world has the courage to put a Tux logo on their first page?

The history you are mentioning is partially false and partially incomplete. Just some examples:

The whole effort of localizing Linux to Persian in a standard way was started before any company was interested in the matter, by Sharif University of Technology's computing center.
The keyboard layout you are mentioning, which I assume is the one in XFree86 (latest version here), is not designed by any company. It's based on the Iranian standard ISIRI 2901, funded by the same HCI in 1998. It was I who provided the information to Rubert Brady, who then worked for SuSE, as you can see in the file's header. You can also see my Sharif email address there.
The Windows keyboard layout is a mess, yes, simply because they did not have any contact with Iranian experts to tell them about the national standard, which was developed by HCI. HCI has already agreed to the layout, of course; otherwise, why would they have published it back in 1998?
Shabdix, the distribution you are talking about, is actually Knoppix-based. HCI is also funding the Chapar Shabdiz company, the distributors of Shabdix, for their release 1.0. I don't recall the exact amount, but it was more than USD 25,000.
You are mentioning that HCI is defining projects for adding UTF-8 support in Qt and GTK+. That's not so. They are asking for proper internationalization and localization of such programs and libraries. Some examples are: user-friendly bidirectional editing and display (which is very hard), proper display of Persian numbers (which use different shapes from the common European ones, known in the world as Arabic numerals), proper support of the Iranian calendar, etc.
You are claiming that Chapar Shabdiz was the "only" producer of "actual code". Please show me the code generated by them, and compare it with the amount of code created by Sharif people (GNU FriBidi is just an example, co-maintained by me, used in AbiWord and GNOME, and included in many distributions including Fedora and Mandrake). As far as I can tell, there is only one piece of code included in international Linux distributions created by Chapar Shabdiz, and that is the Iranian calendar support in KDE's PIM.
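
To make the point about Persian numbers concrete, here is a minimal sketch of what "different shapes" means at the character level. It assumes Python 3 and its standard unicodedata module; the helper name to_persian_digits is made up for illustration and is not taken from any of the projects mentioned above. Real localization involves much more than this (bidirectional layout, calendars, and so on).

```python
import unicodedata

# Persian (Extended Arabic-Indic) digits occupy U+06F0..U+06F9.
PERSIAN_DIGITS = "".join(chr(0x06F0 + i) for i in range(10))

# Translation table from ASCII digits to Persian digit shapes.
_TO_PERSIAN = {ord(str(i)): PERSIAN_DIGITS[i] for i in range(10)}


def to_persian_digits(text: str) -> str:
    """Replace ASCII digits with Persian digits (hypothetical helper)."""
    return text.translate(_TO_PERSIAN)


year = to_persian_digits("2004")
print(year)                        # same number, Persian shapes
print(int(year))                   # Python parses Unicode decimal digits
print(unicodedata.name(year[0]))   # official character name of the first digit
```

The numeric value is the same in both forms; only the characters, and therefore the displayed shapes, differ.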

SCO Unix in Iran on SCO Amends Suit, Clarifies "Violations", Triples Damages · 2003-06-17 03:41 · Score: 4, Interesting

The suit also adds illegal export issues stemming from the worldwide availability of open-source software. SCO claims IBM has breached its contract by making multiprocessor operating system technology available "for free distribution to anyone in the world," including residents of Cuba, Iran, Syria, North Korea and Libya, countries to which the United States controls exports. The open-source technology IBM released "can be used for encryption, scientific research and weapons research," the suit said.

Guess what? SCO Unix is already used widely in Iran. I can confirm it. I live in Iran.

So perhaps it's SCO itself that is breaking the US export regulations.

Re:SMS Spam in Iran on SMS SPAM to be Banned Down Under? · 2003-06-15 19:26 · Score: 1

Well, we have a famous religious MP, from the time of Shahs (now deceased, of course), whose face is printed on IRR 100 notes. A quote accompanies the face: "Our religion is the same as our politics, and our politics is the same as our religion".

SMS Spam in Iran on SMS SPAM to be Banned Down Under? · 2003-06-15 08:22 · Score: 1

Hello, this is Iran, and it's been just a few months since we got SMS on our network, which is a government monopoly. OK, so you think there will be no need for a law to restrict network operators from passing spam? No, you're wrong! The government operator itself SPAMs us. But what does it have to advertise? It's the government after all. Hmm... Well, it's a religious government, and we've got all kinds of prophets and saints. On the birthday or the death day of each, we receive spam celebrating or mourning them, written in high-school-essay Persian transcribed horribly into Latin!

... as in DeVice Independent? on High Resolution DVI Support for Plasma Displays? · 2002-08-09 17:15 · Score: 1

This really scared me at first glance. DVI stood for something named DeVice Independent in the good old days, when everyone still used TeX to format her thesis. Oh, just how sweet were the days, when you knew every nice girl would someday need you, the local TeXpert.

Now we're back to troff again...

and finally VIM is Free Software on Vim's Bram Moolenaar On Open Source And Vim 6.0 · 2002-01-02 22:51 · Score: 1

For a long time, RMS did not accept Vim's license as Free Software. But just recently, he accepted it. You can find it here:

"After further consideration, and discussions with some people whose advice I rely on, I concluded that the Vim license does qualify as a free software license. Its requirements don't go as far as the ones that we rejected many years ago."

Donald Knuth's Way... on E-mail Overload: Welcome Back to School · 2001-09-04 03:29 · Score: 4, Interesting

...is not reading email anymore. Read it at Knuth versus Email.

'I don't even have an e-mail address. I have reached an age where my main purpose is not to receive messages.'
--- Umberto Eco, quoted in the New Yorker

Harry Potter pr0n on Harry Potter Wins Hugo · 2001-09-03 06:53 · Score: 1

I came across an MSNBC story about Harry Potter pr0n some days ago. You can read it at: http://www.msnbc.com/news/621503.asp

Re:But what about .EDU? on Update On Efforts To Block .us Giveaway · 2001-07-23 03:25 · Score: 1

Why do you need an .EDU in iran? Don't they kill internet users there?

Well, not exactly killing, they only give us a hard time, which some may consider worse. The problem is that we need to change the situation:

The current .ir registrar has a really hard mechanism for registering a domain, and for changing that, we need to tell them that Iranians will just go and register some .com or .net if they don't open the registry, and they will lose lots of money, among other things.

Global domains are good for that, the government can't restrict you with its weird policies. With one less global TLD, Internet content providers in a country like ours should go kill themselves if not killed by them. ;-)

But what about .EDU? on Update On Efforts To Block .us Giveaway · 2001-07-23 02:16 · Score: 4

I wonder what will happen to .edu: as outlined in RFC 1591, the TLD belongs to the global community of educational institutions, and not only to Americans:

EDU - This domain was originally intended for all educational institutions. Many Universities, colleges, schools, educational service organizations, and educational consortia have registered here. More recently a decision has been taken to limit further registrations to 4 year colleges and universities. Schools and 2-year colleges will be registered in the country domains (see US Domain, especially K12 and CC, below).

But according to this Slashdot article, the US Department of Commerce gave it away to something named EDUCAUSE, which doesn't let universities outside the USA get a .EDU.

As a user of a .edu here in Iran, that really hurts...

Re:Original article was just ignorant FUD on Why Unicode Will Work On The Internet · 2001-06-09 05:22 · Score: 1

This reply is posted there before publication here, and has a "4: Informative". The reader will not miss it if he cares enough.

Re:Glyphs versus characters in Castillian on Why Unicode Will Work On The Internet · 2001-06-09 05:14 · Score: 3

In Unicode terms, "ch" is named a grapheme; it's different from a character. (Or you may want to call it a letter.) It is encoded using the two characters "c" and "h". It is something that is considered a unit in some places, but not in others. I would recommend taking a look at the Unicode Standard book, which you can read online. These things are in chapters 1 and 2.

About string ordering, Unicode does not claim anything. If you look into ASCII, you will find that even that is not suitable for normal English sorting, since "B" is encoded before "a". But don't go away. Unicode has a Collation Algorithm that specifies what one should do for advanced natural-language ordering of strings, and also tells what one should do with the Castilian "ch".
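
A small sketch of both points, assuming Python 3. The function traditional_spanish_key is a toy invented for illustration: it only treats "ch" as one unit sorting after plain "c", and it is not the Unicode Collation Algorithm, which handles far more than this.

```python
# Code point order is not dictionary order: "B" (0x42) sorts before "a" (0x61).
words = ["apple", "Banana", "cherry"]
print(sorted(words))                    # plain code point order
print(sorted(words, key=str.casefold))  # case-insensitive order

# "ch" is stored as two characters, even where it is treated as one letter.
print(len("ch"))


def traditional_spanish_key(word):
    """Toy sort key: 'ch' is one unit ordered after 'c' (hypothetical helper)."""
    folded = word.casefold()
    key = []
    i = 0
    while i < len(folded):
        if folded.startswith("ch", i):
            key.append(("c", 1))  # the unit "ch": after every plain "c"
            i += 2
        else:
            key.append((folded[i], 0))
            i += 1
    return key


spanish = ["cuna", "chico", "casa", "dama"]
print(sorted(spanish))                               # "chico" lands before "cuna"
print(sorted(spanish, key=traditional_spanish_key))  # "chico" lands after "cuna"
```

The encoding stays the same in both cases; only the comparison rule changes, which is why ordering belongs in a collation layer and not in the character codes.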

Re:yes, unicode works, but is unnecessary. on Why Unicode Will Work On The Internet · 2001-06-09 04:59 · Score: 1

Also, if there's redundancy in Unicode, I imagine most of that space could be saved with gzip, which also has good support over the web, though like Unicode is far underused.

Well, one may also try the Standard Compression Scheme for Unicode.
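
For the gzip half of that idea, a rough sketch, assuming Python 3 and its zlib module (which uses the same DEFLATE method as gzip). The sample string and the function name encoded_sizes are made up for illustration, and a heavily repeated string like this one compresses far better than real text would. The Standard Compression Scheme for Unicode is not shown, since the Python standard library does not ship an implementation of it.

```python
import zlib


def encoded_sizes(text):
    """Return {encoding: (raw_bytes, compressed_bytes)} (hypothetical helper)."""
    sizes = {}
    for encoding in ("utf-8", "utf-16-le"):
        raw = text.encode(encoding)
        sizes[encoding] = (len(raw), len(zlib.compress(raw)))
    return sizes


# "salam" in Persian letters plus a space, repeated to give the
# compressor something to work with.
sample = "\u0633\u0644\u0627\u0645 " * 200

for encoding, (raw_len, packed_len) in encoded_sizes(sample).items():
    print(f"{encoding}: {raw_len} bytes raw, {packed_len} bytes compressed")
```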

Re:yes, unicode works, but is unnecessary. on Why Unicode Will Work On The Internet · 2001-06-09 04:55 · Score: 1

unicode works, but is unnecessary

It is necessary for extended scripts, like Persian, which is a kind of extended Arabic script, and for many of the minor scripts of the world, like Syriac.

I haven't seen a homepage in Unicode yet.

Then see my homepage!

Unicode's reply on Why Unicode Won't Work on the Internet · 2001-06-05 07:49 · Score: 4

It's probably too late, but the following is a response from one of the editors of the Unicode Standard:

Dear Mr. Carroll,

I have just finished reading the article you published today on the Hastings Research website, authored by Norman Goundry, entitled "Why Unicode Won't Work on the Internet: Linguistic, Political, and Technical Limitations."

Mr. Goundry's grounding in Chinese is evident, and I will not quibble with his background East Asian historical discussion, but his understanding of the Unicode Standard in particular and of the history of Han character encoding standardization is woefully inadequate. He makes a number of egregiously incorrect statements about both, which call into question the quality of research which went into the Unicode side of this article. And as they are based on a number of false premises, the article's main conclusions are also completely unreliable.

Here are some specific comments on items in the article which are either misleading or outright false.

Before getting into Unicode per se, Mr. Goundry provides some background on East Asian writing systems. The Chinese material seems accurate to me. However, there is an inaccurate statement about Hangul: "Technically, it was designed from the start to be able to describe *any sound* the human throat and mouth is capable of producing in speech, ..." This is false. The Hangul system was closely tied to the Old Korean sound system. It has a rather small number of primitives for consonants and vowels, and then mechanisms for combining them into consonantal and vocalic nuclei clusters and then into syllables. However, the inventory of sounds represented by the Jamo pieces of the Hangul are not even remotely close to describing any sound of human speech. Hangul is not and never was a rival for IPA (the International Phonetic Alphabet).

In the section on "The Inability of Unicode To Fully Address Oriental Characters", Mr. Goundry states that "Unicode's stated purpose is to allow a formalized font system to be generated from a list of placement numbers which can articulate *every single written language* on the planet." While the intended scope of the Unicode Standard is indeed to include all significant writing systems, present and past, as well as major collections of symbols, the Unicode Standard is *not* about creating "formalized font systems", whatever that might mean. Mr. Goundry, while critiquing Anglo-centricity in thinking about the Web and the Internet as an "unfortunate flaw in Western attitudes" seems to have made the mistake of confusing glyph and character -- an unfortunate flaw in Eastern attitudes that often attends those focussing exclusively on Han characters.

Immediately thereafter, Mr. Goundry starts making false statements about the architecture of the Unicode Standard, making tyro's mistakes in confusing codespace with the repertoire of encoded characters. In fact the codespace of the Unicode Standard contains 1,114,112 code points -- positions where characters can be encoded. The number he then cites, 49,194, was the number of standardized, encoded characters in the Unicode Standard, Version 3.0; that number has (as he notes below) risen to 94,140 standardized, encoded characters in the *current* version of the Unicode Standard, i.e., Version 3.1. After taking into account code points set aside for private use characters, there are still 882,373 code points unassigned but available for future encoding of characters as needed for writing systems as yet unencoded or for the extension of sets such as the Han characters.

*Even if* Mr. Goundry's calculation of 170,000 characters needed for China, Taiwan, Japan, and Korea were accurate, the Unicode Standard could accommodate that number of characters easily. (Note that it already includes 70,207 unified Han ideographs.) However, Mr. Goundry apparently has no understanding of the implications or history of Han unification as it applies to the Unicode Standard (and ISO/IEC 10646). Furthermore, he makes a completely false assertion when he states that Mainland China, Taiwan, Korea, and Japan "were not invited to the initial party."

Starting with the second problem first, a perusal of the Han Unification History, Appendix A of the Unicode Standard, Version 3.0, will show just how utterly false Mr. Goundry's implication that the Asian countries were left out of the consideration of encoding of Han characters in the Unicode Standard is. Appendix A is available online, so there really is no valid research excuse for not having considered it before haring off to invent nonexistent history about the project, even if Mr. Goundry didn't have a copy of the standard sitting on his desk. See:

http://www.unicode.org/unicode/uni2book/appA.pdf

The "historical" discussion which follows in Mr. Goundry's account, starting with "The reaction was predictable..." is nothing less than fantasy history that has nothing to do with the actual involvement of the standardization bodies of China, Japan, Korea, Taiwan, Hong Kong, Singapore, Vietnam, and the United States in Han character encoding in 10646 and the Unicode Standard over the last 11 years.

Furthermore, Mr. Goundry's assertions about the numbers of characters to be encoded show a complete misunderstanding of the basics of Han unification for character encoding. The principles of Han unification were developed on the model of the main *Japanese* national character encoding, and were fully assented to by the Chinese, Korean, and other national bodies involved. So assertions such as "they [Taiwan] could not use the same number [for their 50,000 characters] as those assigned over to the Communists on the Mainland" is not only false but also scurrilously misrepresents the actual cooperation that took place among all the participants in the process.

Your (Mr. Carroll's) editorial observation that "It is only when you get *all* the nationalities in the same room that the problem becomes manifest," runs afoul of this fantasy history. All the nationalities have been participating in the Han unification for over a decade now. The effort is led by China, which has the greatest stakeholding in Han characters, of course, but Japan, Korea, Taiwan and the others are full participants, and their character requirements have *not* been neglected.

And your assertion that many Westerners have a "tendency .. to dismiss older Oriental characters as 'classic,'" is also a fantasy that has nothing to do with the reality of the encoding in the Unicode Standard. If you would bother to refer to the documentation for the Unicode Standard, Version 3.1, you would find that among the sources exhaustively consulted for inclusion in the Unicode Standard are the KangXi dictionary (cited by Mr. Goundry), but also Hanyu Da Zidian, Ci Yuan, Ci Hai, the Chinese Encyclopedia, and the Siku Quanshu. Those are *the* major references for Classical Chinese -- the Siku Quanshu *is* the Classical canon, a massive collection of Classical Chinese works which is now available on CDROM using Unicode. In fact, the company making it available is led by the same man who represents the Chinese national standards body for character encoding and who chairs the Ideographic Rapporteur Group (the international group that assists the ISO working group in preparing the Han character encoding for 10646 and the Unicode Standard).

Mr. Goundry's argument for "Why Unicode 3.1 Does Not Solve the Problem" is merely that "[94,140 characters] still falls woefully short of the 170,000+ characters needed"-- and is just bogus. First of all the number 170,000 is pulled out of the air by considering Chinese, Japanese, and Korean repertoires *without* taking Han unification into account. In fact, many *more* than 170,000 candidate characters were considered by the IRG for encoding -- see the lists of sources in the standard itself. The 70,207 unified Han ideographs (and 832 CJK compatibility ideographs) already in the Unicode Standard more than cover the kinds of national sources Mr. Goundry is talking about.

Next Mr. Goundry commits an error in misunderstanding the architecture of the Unicode Standard, claiming that "two *separate* 16 bit blocks do not solve the problem at all." That is not how the Unicode Standard is built. Mr. Goundry claims that "18 bits wide" would be enough -- but in fact, the Unicode Standard codespace is 21 bits wide (see the numbers cited above). So this argument just falls to pieces.

The next section on "The Political Significance Of This Expressed In Western Terms" is a complete farce based on false premises. I can only conclude that the aim of this rhetoric is to convince some ignorant Westerners who don't actually know anything about East Asian writing systems -- or the Unicode Standard, for that matter -- that what is going on is comparable to leaving out five or six letters of the Latin alphabet or forcing "the French ... to use the German alphabet". Oh my! In fact, nothing of the kind is going on, and these are completely misleading metaphors.

The problem of URL encodings for the Web is a significant problem, but it is not a problem *created* by the Unicode Standard. It is a problem which is being actively worked on by the IETF currently, and it is quite likely that the Unicode Standard will be a significant part of the *solution* to the problem, enabling worldwide interoperability, rather than obstructing it.

And it isn't clear where Mr. Goundry comes up with asides about "Ascii-dependent browsers". I would counter that Mr. Goundry is naive if he hasn't examined recently the internationalized capabilities of major browsers such as Internet Explorer -- which themselves depend on the Unicode Standard.

Mr. Goundry's conclusion then presents a muddled summary of Unicode encoding forms, completely missing the point that UTF-8, UTF-16, and UTF-32 are each completely interoperable encoding forms, each of which can express the entire range of the Unicode Standard. It is incorrect to state that "Unicode 3.1 has increased the complexity of UCS-2." The architecture of the Unicode Standard has included UTF-16 (not UCS-2) since the publication of Unicode 2.0 in 1996; Unicode 3.1 merely started the process of standardizing characters beyond the Basic Multilingual Plane.

And if Mr. Goundry (or anyone else) dislikes the architectural complexity of UTF-16, UTF-32 is *precisely* the kind of flat encoding that he seems to imply would be preferable because it would not "exacerbate the complexity of font mapping".

In sum, I see no point in Mr. Goundry's FUD-mongering about the Unicode Standard and East Asian writing systems.

Finally, the editorial conclusion, to wit, "Hastings [has] been experimenting with workarounds, which we believe can be language- and device-compatible for all nationalities," leads me to believe that there may be a hidden agenda for Hastings in posting this piece of so-called research about Unicode. Post a seemingly well-researched white paper with a scary headline about how something doesn't work, convince some ignorant souls that they have a "problem" that Unicode doesn't address and which is "politically explosive", and then turn around and sell them consulting and vaporware to "fix" their problem. Uh-huh. Well, I'm not buying it.

--Ken Whistler, B.A. (Chinese), Ph.D. (Linguistics),
Technical Director, Unicode, Inc.
Co-Editor, The Unicode Standard, Version 3.0
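
The numbers in the letter are easy to check. Here is a minimal sketch, assuming Python 3: the codespace figure of 1,114,112 is 17 planes of 65,536 code points each (the plane breakdown is standard Unicode architecture, not something spelled out in the letter), and the helper encoding_forms is a made-up name for illustration. The sample character U+20000 is simply one code point outside the Basic Multilingual Plane.

```python
# 17 planes of 0x10000 code points each give the codespace size.
PLANES = 17
CODE_POINTS_PER_PLANE = 0x10000
CODESPACE_SIZE = PLANES * CODE_POINTS_PER_PLANE
MAX_CODE_POINT = CODESPACE_SIZE - 1

print(CODESPACE_SIZE)               # size of the codespace
print(hex(MAX_CODE_POINT))          # highest code point
print(MAX_CODE_POINT.bit_length())  # bits needed to hold any code point


def encoding_forms(ch):
    """Encode one character in the three encoding forms (hypothetical helper)."""
    return {name: ch.encode(name) for name in ("utf-8", "utf-16-be", "utf-32-be")}


# A Han ideograph beyond the Basic Multilingual Plane.
han = chr(0x20000)
for name, data in encoding_forms(han).items():
    # Every form decodes back to the same character.
    print(name, data.hex(), data.decode(name) == han)
```

All three forms carry the same code point; UTF-16 uses a surrogate pair for it, while UTF-32 stores it flat.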

Open Sales on Reiser On ReiserFS's Future And More · 2001-05-23 00:19 · Score: 1

You haven't seen his brilliant idea? He will start evangelizing Open Sales:

In open sales, you [as the software licencee - ed.] join a pool [of licensed software] and pay a percentage of your hardware costs to use the software in the pool (if you can buy the hardware you should be able to afford a percentage of it for the software). The licensing software picks 5 random [software] vendors for you to divide your money between in proportion to the value you perceive in it, and your money gets divided between them. All software in the pool should have publicly available source code so that others can add improvements to it. This would fix the problems inherent in both copyright and patent law to a state better than they are now. Users would vote on what the percentage of hardware cost license fee would be, votes weighted according to how much they paid last year.

I wonder what RMS will say if he sees that idea...

But it's a gTLD! on Educational Consortium Will Control .edu Domains · 2001-04-12 17:25 · Score: 1

What everyone has failed to notice is that .edu is a "global TLD", just like its brethren .com, .org, .net, and their kid brother .int. Just take a look at RFC 1591 for the definitions. This is the section about .edu:

World Wide Generic Domains:
...
EDU - This domain was originally intended for all educational institutions. Many Universities, colleges, schools, educational service organizations, and educational consortia have registered here. More recently a decision has been taken to limit further registrations to 4 year colleges and universities. Schools and 2-year colleges will be registered in the country domains (see US Domain, especially K12 and CC, below).

I wonder what DoC's decision will mean for universities outside the US who use their .edu address regularly. My university in Iran is just one example. (I registered the .edu domain myself, after a lot of disputes with Network Solutions.)

But they are FSF donors! on Microsoft Clarifies Jim Allchin's Statements · 2001-02-20 13:04 · Score: 1

How can this be when they are FSF donors?! ;)

Just take a look at the Thank GNUs page on the FSF homepage, and search for Microsoft on the page. Microsoft Corporation is listed there...

TeX award [was Re: Complexity] on Could LaTeX Replace HTML? · 2000-12-04 18:38 · Score: 1

The award for finding a bug in TeX is not $3.14, but $327.68, and that's not for all bugs. Take a look here, lines 47 and 48:

A reward of $327.68 will be paid to the first finder of any remaining bug, not counting changes introduced after August 1989.

3.14159 is the latest version of TeX.

Comments · 19