You really don't get the point do you.
"CopyLion" might sound weird to you, but who's to judge?
But that's beside the point. As a Hong Konger I'm lucky enough to have my English name added to my HK ID card. A lot of others don't -- my wife, for example. Everybody knows her by her English name. Most don't know her Chinese name, yet that's the only name on her identification papers. What Google is asking for goes beyond what their TOS requires of people. Do you see the problem?
The funny thing is, for a short while today Baidu's search results also became less censored.
Feel free to draw your own conclusions on what this means;)
Furthermore, as a company registered in Hong Kong, Yahoo! HK falls under Hong Kong jurisdiction, where there are laws regarding privacy such as the Personal Data (Privacy) Ordinance. Some info here. In fact Hong Kong's Privacy Commissioner's Office is currently investigating whether Yahoo! HK has breached any HK laws.
It's not as if Google is simply editing the whole event out of its index. Note the line at the bottom of the 2nd page you linked to:
Rough translation:
According to local laws and policies, some search results are not shown.
This line does not appear in all search results. At least Google is letting people know which search terms are being censored. That to me has to be better than simply removing all traces of the event, a la real censorship.
If people cannot see and cannot write, they would just turn back immediately. At this rate the future of linux in China/TW/HK shouldn't be optimistic.
No need to be pessimistic. We passed the point of "cannot read and cannot write" a very long time ago. The problem right now is one of improving user experience and usability. In fact I'd go so far as to say that the current state-of-the-art CJK experience on Linux desktops already surpasses that of Windows XP.
The firefly-arphic fonts have legal issues and will never be accepted in Debian proper unless they are cleared;
At 10-12px, the ideal would be hand-tuned bitmaps for each of the tens of thousands of characters. The problem can be sidestepped by having larger default font sizes and/or better antialiasing and autohinting algorithms (these are being worked on), a la OS X/Aqua;
While the design of IIIMF is excellent (disclaimer: I am the one mentioned in the PR who is on OpenI18N's SC), due to its unconventional design it has the reputation of being unapproachable by input method writers;
GB18030 has the largest defined character set (at least the same as Unicode if not larger) and is the Chinese standard. Products are not allowed to be sold, period, unless they have GB18030 support, and that includes having a font with all the characters.
There's a new input method system called Internet/Intranet Input Method Framework (IIIMF). It was released to the free software community by Sun just over 2 years ago. Currently it's hosted at Li18nux.
Among its advantages over the old X Input Method (XIM) system are:
Not tied to X Window anymore. It should be possible to write an IIIMF client for a console app. In fact there's a sample client implementation for Emacs.
Not tied to the old locale/encoding model; everything is in Unicode. For example it is possible to enter Chinese in en_US locale.
Being Sun, IIIMF uses a client/server model. Theoretically an IIIMF client can access an IIIMF server on a Beowulf cluster...
Disclaimer: I am a voting member on the Li18nux Steering Committee, and I'm also working on a commercial Chinese IIIMF input method for my employer.
Re:The problem with ANY packaging system.... on Is RPM Doomed? · Score: 3, Insightful
(Perhaps I should have pointed this out earlier: I'm a Debian Developer, so consider me biased.)
Yes, .deb alone can't solve this problem; but in cases like these the Debian Policy has some guidelines.
Suppose that Maynard has package libPease.1.4.2.thursday.5-31-41.1-pl3-build6 installed, which is supposed to be back-compatible to package libPease1. When he builds his .debs, he mistakenly builds it with a dependency upon libPease.1.4.2.thursday.5-31-41.1-pl3-build6, rather than libPease1.
In this case, "libPease.so.1.4.2.thursday.5-31-41.1-pl3-build6" should be in the package "libpease1", version "1.4.2.thursday.5-31-41.1-pl3-build6". Other packages always do a Depends: libpease1.
The major soname is part of the package name because binary API changes are supposed to happen only when the major soname changes. This way there might be a "libPease.so.1.xxx" and a "libPease.so.2.xxx" that are binary incompatible but can coexist on a system, so there will be "libpease1" and "libpease2" packages that can be installed together; but "libpease1" version 1.5 will replace "libpease1" version 1.4.2 during an upgrade, because upstream says they're binary compatible.
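A rough sketch of that naming convention, in Python. The function name is made up for illustration; the rule it encodes (lowercased library name plus the major soname number, with a hyphen if the name already ends in a digit) is how I read the Policy, so treat the details as an assumption rather than as anything dpkg itself runs.

```python
import re


def debian_package_name(soname: str) -> str:
    """Derive a runtime library package name from a soname such as 'libPease.so.1'.

    Hypothetical helper: only handles the plain 'name.so.MAJOR' form.
    """
    match = re.fullmatch(r"(?P<name>.+)\.so\.(?P<major>\d+)", soname)
    if match is None:
        raise ValueError(f"not a 'name.so.MAJOR' soname: {soname!r}")
    # Package names are lowercase and cannot contain underscores.
    name = match.group("name").lower().replace("_", "-")
    major = match.group("major")
    # Avoid ambiguity when the library name itself ends in a digit.
    separator = "-" if name[-1].isdigit() else ""
    return f"{name}{separator}{major}"


if __name__ == "__main__":
    for soname in ("libPease.so.1", "libPease.so.2"):
        print(soname, "->", debian_package_name(soname))
```

Two incompatible major versions end up in differently named packages, which is exactly why they can sit side by side on one system.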
Same problem. Only if your packaging system does not allow subversions of a package can you avoid this problem. And if your package does not allow subversions, then if I really do need a feature of libPease.1.4 or later I am screwed - I cannot spell that out in the packaging system, so somebody will install my package when they only have libPease.1.0. Then I have to tell them at runtime they don't have the correct package.
As long as the binary API remains backwards compatible, then the "libpease1" package can be upgraded to 1.4, and packages that require 1.4 features can Depends: libpease1 (>= 1.4). If libPease.so.1.4 is not binary compatible with libPease.so.1.0, then it really should be called libPease.so.2.0. If it isn't, then upstream has stuffed up, so nag upstream about it (I've done it before).
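To make the "Depends: libpease1 (>= 1.4)" idea concrete, here is a minimal Python sketch. It is a deliberate simplification: it only compares dotted numeric versions, whereas real dpkg version comparison also handles epochs, revisions and non-numeric parts. The function names and the `installed` dictionary are invented for the example.

```python
def parse_version(version: str) -> tuple:
    """Turn '1.4.2' into (1, 4, 2). Assumes purely numeric dotted versions."""
    return tuple(int(part) for part in version.split("."))


def satisfies(installed: dict, package: str, minimum: str) -> bool:
    """Check a dependency of the form 'package (>= minimum)'."""
    have = installed.get(package)
    if have is None:
        return False  # package is not installed at all
    return parse_version(have) >= parse_version(minimum)


if __name__ == "__main__":
    old_system = {"libpease1": "1.0"}
    new_system = {"libpease1": "1.4.2"}
    # A package needing 1.4 features is refused on the old system at
    # install time, instead of failing at runtime.
    print(satisfies(old_system, "libpease1", "1.4"))
    print(satisfies(new_system, "libpease1", "1.4"))
```

The point is that the minimum version is spelled out against the stable package name, not against one particular build of the library.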
Additionally, SuperFlyFloobyDust might NOT really NEED the functions of the bastard version, and so even if libPease.1.5 could correctly state "No, I am not a total replacement for that bastard version", SuperFlyFloobyDust could actually run on libPease.1.5, but due to being packaged by an incompetent boob, the program won't install.
This is the problem of the person who did the package, yes? She should test the package before releasing it to the world, just like any other software, whether in source code or binary form (especially in binary form).
No system can guard against incompetent packagers. But with RPM's file dependencies, it's much, much easier to make a mess.
Re:The problem with ANY packaging system.... on Is RPM Doomed? · Score: 2
When Maynard builds his SuperFlyFloobyDust.rpm file, rather than specifying the dependencies as "I need libPease.so", he accepts the default "I need libPease.1.4.2.thursday.5-31-41.1-pl3-build6.so". So, even though any libPease.so would work, you get a dependency failure.
This is a failing not of any specific package manager - ALL package managers have this problem. You don't see it with .debs not because of any inherent superiority of .deb, but rather because of the hard work of the Debian maintainers to make sure the packages are all set up correctly!
Actually, .deb does not allow file dependencies -- only package dependencies are allowed. So if a package needs "libPease.so.1", it will declare Depends: libpease1, not a dependency on the actual library file.
File dependencies make RPM-based systems so much more unmaintainable that, in fact, the LSB forbids them.
Debian's advantage lies not in the packaging format. Technically there's not much difference between rpm and dpkg.
What makes Debian stand out is the Debian Policy, which all Debian Developers must adhere to. Theoretically someone can apply that Policy to an RPM-based system, and it'll be as stable as Debian.
Re:There are no privacy issues whatsoever. on Hong Kong's Octopus · Score: 1
There are several banks offering this service (Automatic Add Value) now; at least Hang Seng (a very major bank) now does.
The computer industry is still strongest in the US, and most OS software is still written by US-based companies. Why don't some Chinese software developers come up with their own language standard and write a bunch of software with it?
The same reason that you can't get a name brand PC without Windows preloaded.
Just check out the latest Nokia phones sold in Asia to see how easy it is to type Chinese SMS messages. Or how about the input method they use on those ESDlife terminals around Hong Kong? Both use fewer than a dozen keys to enter Chinese, with no need for prior training.
Simplified and Traditional Chinese share a lot of similarities. Even the simplified writings of a particular character often look nearly the same as the traditional one. Thus, the encoding for these two can be unified, only the font bitmap is different.
You can't do that, since very often a simplified character maps to several traditional characters. Even if you can, it won't be a saving of 50,000 characters, only several thousand at best.
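A small Python illustration of why the mapping isn't one-to-one. The table holds only a handful of well-known cases and is nowhere near complete; the names are mine, chosen for the example.

```python
# Simplified character -> the traditional characters it can stand for.
# Tiny illustrative sample, not a real conversion table.
SIMPLIFIED_TO_TRADITIONAL = {
    "国": ["國"],
    "发": ["發", "髮"],        # "to emit" vs "hair"
    "干": ["乾", "幹", "干"],  # "dry" vs "to do" vs "shield"
    "后": ["後", "后"],        # "behind" vs "queen"
}


def ambiguous_characters(text: str) -> list:
    """Return the simplified characters in text that have more than one
    traditional counterpart, in order of appearance."""
    return [
        char
        for char in text
        if len(SIMPLIFIED_TO_TRADITIONAL.get(char, [])) > 1
    ]


if __name__ == "__main__":
    print(ambiguous_characters("国发干后"))
```

Sharing one code point between 发 and both 發 and 髮 would mean a font swap alone could never pick the right traditional form; you would need to understand the surrounding text.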
Neither the HTTP headers sent by Slashdot nor the preamble of the HTML file specify a character encoding. Therefore the encoding is the default encoding, i.e. ISO-8859-1 (aka latin-1).
Apparently not, according to the HTML 4.0 standard. Section 5.2.2 Specifying the character encoding states (italics mine):
The HTTP protocol ([RFC2068], section 3.7.1) mentions ISO-8859-1 as a default character encoding when the "charset" parameter is absent from the "Content-Type" header field. In practice, this recommendation has proved useless because some servers don't allow a "charset" parameter to be sent, and others may not be configured to send the parameter. Therefore,
user agents must not assume any default value for the "charset" parameter.
And it goes on to say that
... the user agent may use heuristics and user settings. For example, many user agents use a heuristic to distinguish the various encodings used for Japanese text. Also, user agents
typically have a user-definable, local default character encoding which they apply in the absence of other indicators.
User agents may provide a mechanism that allows users to override incorrect "charset" information.
Now, the "local default character encoding" might very well be ISO 8859-1 for you, but certainly not for me.
The point is, there is no single default encoding for the web.
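Roughly what that means for a user agent, as a Python sketch. It uses the standard library's `email.message.Message` to parse the header; `pick_encoding` and the idea of passing the local default as an argument are my own simplification, and a real browser would also look at `<meta>` tags and run heuristics.

```python
from email.message import Message
from typing import Optional


def charset_from_content_type(header_value: Optional[str]) -> Optional[str]:
    """Return the lowercased charset parameter of a Content-Type header,
    or None if the header or the parameter is missing."""
    if header_value is None:
        return None
    message = Message()
    message["Content-Type"] = header_value
    return message.get_content_charset()


def pick_encoding(content_type: Optional[str], local_default: str) -> str:
    """Use the declared charset if there is one; otherwise fall back to the
    user's own local default, not to a hard-coded ISO-8859-1."""
    declared = charset_from_content_type(content_type)
    return declared if declared is not None else local_default


if __name__ == "__main__":
    # Same undeclared page, two different users, two different answers.
    print(pick_encoding("text/html", "iso-8859-1"))
    print(pick_encoding("text/html", "big5"))
```

With no declared charset, the result depends entirely on who is looking at the page.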
I agree that Netscape is massively broken though -- both IE and Mozilla are much better.
CDDB is not a good example of a properly i18n'd program/web site.
There's actually a way in HTML to tell browsers what encoding was used on a web page. For example, to specify that a web page is encoded in Big5, the usual encoding for Traditional Chinese, put the following in the HTML header: <meta http-equiv="Content-Type" content="text/html; charset=big5">
Yes, that pretty much means that only one encoding should be used within a page, but it's better than nothing.
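To see why the declaration matters, here is a short Python sketch of what a browser is up against: the same bytes read under two different charsets. The `render` helper and the sample text are just for illustration.

```python
def render(raw: bytes, charset: str) -> str:
    """Decode page bytes the way a browser would once it has settled on a charset."""
    return raw.decode(charset)


page_text = "中文"                     # what the author typed
raw_bytes = page_text.encode("big5")  # what actually goes over the wire

if __name__ == "__main__":
    # With the charset declared, the text comes back intact.
    print(render(raw_bytes, "big5"))
    # Guessing ISO-8859-1 never raises an error; it silently yields
    # one wrong character per byte.
    print(repr(render(raw_bytes, "iso-8859-1")))
```

Nothing in the bytes themselves says "Big5", which is why the page has to declare it or the browser has to guess.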
Both IE and Mozilla have "language auto-detection" (really it's encoding auto-detect) that works most of the time.
And before anyone says "Unicode would solve all the problems!", let me tell you that even Unicode 3.0 doesn't include all of the characters in common use around the Greater China region, let alone around the world.
But I bet myself $0.05 that it will not show up correctly and may even stuff the rest of the posts.
Under Linux, using Netscape 4.75, select View -> Encoding -> Japanese (Auto Select). Works perfectly here.
I agree that a lot of old software can't handle non-English languages well, but modern programs are likely to have no trouble with them. (GTK/GNOME ones are a good example.)
Whenever you do something the GFW doesn't like, it will inject TCP RST packets to kill your TCP session(s). They've been doing this for years.
Go read the charter yourself. Where the hell does it promote radical capitalism?
Yeah and this is what happens. Doesn't work.
Wrong website. You should look in the Yahoo! HK site instead. Specifically, Yahoo! HK's Privacy Policy.
To give the problems a bit more perspective:
You might want to have a look at the Linux Standard Base for a widely supported binary executable format standard.
Great, the representative for the electorate I'm in (Higgins, VIC) is none other than Peter Costello, the current Federal Treasurer.
I doubt writing to him would do anything.
I'm surprised that nobody has mentioned this...
Octopus card
It's a card/watch that everyone in Hong Kong carries nowadays. I could pretty much go on for a day just on that card:
Carrying the Octopus means I no longer need to carry those little coins and small notes.
Yes, the system is centralized, but if you're worried about privacy, at least you can get anonymous cards.
Perhaps you're after apt-get build-dep whatever?
But Yahoo isn't a good example of where Hong Kong people trade phone numbers -- they're often traded in mobile phone malls.