Slashdot Mirror


Why Netscape shows ? instead of '

RandySC writes "Demoroniser is a Perl program which corrects numerous errors and incompatibilities in HTML generated by, or edited with, Microsoft applications. The demoroniser keeps you from looking dumber than a bag of dirt when your Web page is viewed by a user on a non-Microsoft platform. A little detective work revealed that, as is usually the case when you encounter something shoddy in the vicinity of a computer, Microsoft incompetence and gratuitous incompatibility were to blame. Western language HTML documents are written in the ISO 8859-1 Latin-1 character set, with a specified set of escapes for special characters. Blithely ignoring this prescription, as usual, Microsoft use their own "extension" to Latin-1, in which a variety of characters which do not appear in Latin-1 are inserted in the range 0x82 through 0x95--this having the merit of being incompatible with both Latin-1 and Unicode, which reserve this region for additional control characters." So now we know what happened to Jon.
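For the curious, the kind of substitution described above can be sketched in a few lines of Python. This is not the demoroniser's actual code (that is Perl); the table name ASCII_FALLBACKS and the function asciify are made up for illustration, and the table covers only a handful of the Microsoft byte values. Note that the two dash bytes, 0x96 and 0x97, sit just above the range quoted in the summary.

```python
# Hypothetical fallback table: Microsoft (Windows-1252) byte -> plain ASCII.
# Only a few entries are shown; a real tool would need a fuller table.
ASCII_FALLBACKS = {
    0x82: b",",    # low single quote
    0x84: b",,",   # low double quote
    0x85: b"...",  # ellipsis
    0x91: b"'",    # left single quote
    0x92: b"'",    # right single quote / apostrophe
    0x93: b'"',    # left double quote
    0x94: b'"',    # right double quote
    0x95: b"*",    # bullet
    0x96: b"-",    # en dash (just above the quoted range)
    0x97: b"--",   # em dash (just above the quoted range)
}


def asciify(data: bytes) -> bytes:
    """Replace known Microsoft-only bytes with ASCII stand-ins.

    Bytes that are not in the table are copied through unchanged.
    """
    out = bytearray()
    for byte in data:
        out += ASCII_FALLBACKS.get(byte, bytes([byte]))
    return bytes(out)


if __name__ == "__main__":
    # The apostrophe in "It's" is the byte a non-Microsoft browser may show as "?".
    print(asciify(b"It\x92s \x93fine\x94"))
```

Working on raw bytes rather than decoded text keeps the sketch independent of whatever the rest of the page is encoded in, at the cost of losing the curved glyphs entirely.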

1 of 104 comments

  1. Blame W3C, too... by Craig · Score: 2
    OK, Microsoft should be condemned because their HTML-crunching products use characters in that numeric range. (They should also be condemned for hard-coding absolute font sizes instead of relative ones like +1 or -2.)

    But the HTML spec has a glaring lack that motivates this violation in the first place: no curved quotes and apostrophes, and no em-dash.

    Now, HTML is supposed to display by default in a proportional font, like printed matter (it's easier to read, among other advantages). But proportional fonts always use curved, symmetric double and single quotes.

    Likewise, proportional fonts always distinguish between a hyphen and a dash; most, in fact, have two dashes (the en dash and the em dash) of slightly different widths, in addition to the hyphen.

    But the HTML spec (and ISO 8859-1) assumes the broken ASCII/typewriter usage, which in proportional fonts is jarring and ugly. Font specs should be designed by people who know something about fonts, not by engineers!

    The situation is potentially worse in other languages, though I'm not sure how the other ISO 8859-x specs handle it. In German, for example, the opening double and single quotes are traditionally at the bottom of the print line rather than the top, in addition to being reverse-curved, and French uses "guillemets", which look like doubled angle brackets.

    Search the Web for things like ampersand-emdash-semicolon and ampersand-lquot-semicolon -- which are attempts to address the problem -- and you'll see that this gaping mistake in HTML/ISO 8859-1 bothers a lot of people.

    So yeah, blame Microsoft for a kluge that works on only maybe four out of five of the web-surfing PCs out there. But complain to the ISO and to W3C for their oversight, too.

    Craig
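
One way to keep the curved quotes and dashes without shipping the Microsoft byte values is to emit numeric character references for the corresponding Unicode code points. The Python sketch below assumes the input really is Windows-1252 and that the reader's browser understands numeric references, which is not guaranteed; the function name to_latin1_with_refs is invented for this example.

```python
def to_latin1_with_refs(data: bytes) -> bytes:
    """Re-encode Windows-1252 bytes as Latin-1 plus numeric references.

    Assumption: `data` is Windows-1252. Characters that exist in Latin-1
    stay as single bytes; everything else (curved quotes, dashes, ...)
    becomes a decimal reference such as &#8220;.
    Byte values that Windows-1252 leaves undefined (0x81, for example)
    raise UnicodeDecodeError in this sketch.
    """
    text = data.decode("cp1252")
    return text.encode("latin-1", errors="xmlcharrefreplace")


if __name__ == "__main__":
    # 0x93/0x94 are curved double quotes, 0x97 is the em dash, 0xe9 is e-acute.
    print(to_latin1_with_refs(b"\x93caf\xe9\x94 \x97 ok"))
```

Targeting Latin-1 rather than plain ASCII leaves accented letters alone, so only the characters that Latin-1 lacks get escaped.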