Slashdot Mirror


Why Haven't Special Character Sets Caught On?

theodp asks: "Almost forty years after Kenneth Iverson's APL\360 employed neat Selectric hacks to implement Special Character Sets to express operators with a single symbol, we're still using clunky notation like '<>', '^=', or 'NE' to represent inequality and cryptic escape sequences like '\n' to denote a new line, even though the Mac brought GUIs to the masses more than twenty years ago. Why?"

10 of 117 comments

  1. Why? by wowbagger · · Score: 5, Insightful
    Why are we not using characters that are:
    1. Hard to generate on a standard keyboard
    2. Not standardized in the specifications of the language.
    3. Not standardized in the character sets of most non-bitmapped displays.
    4. Not standardized in HTML markup.


    Gosh, I don't know!

    Now, if you will excuse me, I need to create a local variable named <The Symbol for the Artist Formerly Known as "The Artist Formerly Known As Prince">
  2. Input method simplicity by Kelson · · Score: 3, Insightful

    In programming? Most languages seem to be designed with ASCII in mind, so you have to stick with what's available there.

    In general? I think it's a matter of input methods. Give me an input method where it takes only two keystrokes to type "" and I'll use it instead of "NE" or "". If I need to use a vulcan death grip, remember a code, or find it in a character map, I'm only going to bother when I have motivation: either making a point, like earlier in this paragraph, or making a polished document. Why go to the effort in a casual email, or a forum post, when it's much easier to type "" instead?

  3. Argh! Here's another reason! by Kelson · · Score: 4, Funny

    I entered an actual not-equal sign in that post, and Slashcode stripped it out!

  4. Take a step back and look at this question again by LeninZhiv · · Score: 4, Insightful

    \n is cryptic and APL isn't?

    I'd say it's more a question of 'choose your poison'. There is a learning curve whether one aims at mathematics-based notation schemes or historical computer science notations, and the market has already chosen (30 years ago) which one it prefers.

    And not without cause. Human language looks a lot more like modern programming languages than mathematical notation, and a major goal of programming language design is to make it as straightforward as possible to tell the computer what you want it to do. One might object that by that argument Cobol is better than C, but humans, especially experts working in a specific domain, like abbreviations too. Cobol is hated because it doesn't allow you to abbreviate, not because it is hard to read, after all. APL or other such specialised syntaxes are hard to read and they don't fit closely enough with the way non-mathematicians think to be intuitive.

  5. Listen to me by Profane MuthaFucka · · Score: 5, Interesting

    Now sonny, sit down a second and listen to grandpa rant about the good old days. The truth is, when I talk about the good old days, it's not because the days were actually good. It's because I have a sucky memory and questionable taste.

    Now it is TRUE that I once did do programming in APL. This was on an old Zenith 8088 based PC clone with 640K of memory, a CGA display, and a 20 meg hard drive. The system itself worked rather well. If you could work a line editor, the development environment was all you could want. The problem was all the little stickers that went on the keys. Every key mapped to about three other symbols besides the normal ones, and just about every key had a little sticker on it. It was NOT fun. Just because your computers can display characters that look like Chinese doesn't mean that it's a good idea.

    --
    Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!
  6. Simple by fm6 · · Score: 4, Insightful
    Same reason the Dvorak keyboard has never caught on -- nobody wants to learn to type all over again.

    Display was never the issue with APL. There are implementations of APL that use keywords instead of symbols. It's just that turning everything into an operator makes for really dense, hard-to-maintain code.

    I'm reminded of Forth, which lacks APL's weird symbols, but shares its reputation for dense code. In its heyday, Forth programmers justified using it by claiming it made them more productive. And that's true — if you define "productivity" as "number of lines of new code hacked out per day". But code isn't just written, it's maintained, and dense languages are not maintenance friendly.

  7. it is really a deep one. by torpor · · Score: 4, Insightful

    {disclaimer: i'm a closet fontographer.}

    i've thought about this question since 1978, as i have encountered over the years since then a grand litany of different ways of describing symbols in such a way that they can be standardly used, and i have come to a very simple answer. humans are stuck on a symbol treadmill with infinitely smooth bearings.

    fontography is a lesson of symbols .. and the description of these symbols is limited by strict hardware limits: economic, social, cultural elements all have a part to play in the definition of input devices. where i say QWERTYZXCV, you say QWERTZYXCV.

    we haven't seen terribly wide-spread specialization of symbols because of the producer-/consumer- cults of USKEY101, and people's unfamiliarity with alt-numkeypad chops, and Mac vs. PC, and ASCII vs. UTF-8, and XML vs. .bin, and "X" vs. "Y", blah blah, ad infinitum..

    the fact is, perhaps deep down inside we know we should be grateful for what we've got, and let the "!=" and ">=" expressions, 2 lonely bytes in a vast nasty sea, stand as testament to the human desire to at least, a little bit, get along on the same key. they may not be pretty, but pretty much everyone can get to those two bytes and use them when they need to .. it's only a tiny clique that can do the alt-numpad thing, and even fewer who choose to jump out of the ASCII pool and towel off..

    --
    ; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
  8. Lowest common denominator by Craig Maloney · · Score: 4, Insightful

    It's pretty simple: Lowest common denominator. Creating special character sets creates incompatibilities with other machines out there. That's why ASCII was such a boon, and why character sets like PETASCII, ATASCII, and others fell by the wayside. (And if you really want some character set fun, try EBCDIC sometime).

  9. And music is hard to read for non-musicians by Quarters · · Score: 3, Insightful

    \r\n, =, !=, etc... make sense to programmers. They understand the language. Just like the design of 32nd, 16th, 8th, 1/4, 1/2, and whole notes, along with extra notation to modify their true length of play and volume, makes sense to musicians. Why waste time and effort to make it readable for the masses when the masses probably don't care? If they did they'd learn to read the language.

  10. Re: \n as newline by some guy I know · · Score: 3, Insightful
    And for non-visual characters like 'newline'.... what other idea, exactly, did you have?
    How about U+2424?
    Actually, that's the symbol for a graphic representing a newline (a slightly raised N next to a slightly lowered L, shrunk and crammed together into an area approximately a single em-space wide), so maybe that's not such a good idea (as how would you represent the graphic itself in a string?).
    OTOH, a \ followed by U+2424 could better represent a newline graphically in a string.

    The reason that \n seems "pretty straightforward" is that most of us are used to it.
    The concept of backslash followed by a letter representing a control character started in C in the early 1970s (or possibly even in earlier languages), and has been copied into dozens of other languages, along with other things like using % in printf strings to format variables (although some languages, like Ruby, are starting to offer alternative representations to %).
    Note that, in Common LISP, a newline is represented by ~% and ~& in formatting strings, and #\Newline (spelled just that way) represents a newline character outside of formatting strings.
    In Object Pascal/Delphi, a newline is represented by its decimal or hexadecimal equivalent, #10 or #$0A.
    Some languages, like Python and sh/ksh/bash/etc., allow an actual newline in a string itself, so no representation is necessary (although Python allows \n as well, in its non-raw strings).
    Other representations that I have seen in the past include ^J and ^M^J (for line feed and carriage return/line feed as control characters) and $, which marks end-of-line in regular expressions (although the $ doesn't usually match the actual newline itself) and in "list" mode in vi.
    --
    Those who sacrifice security to condemn liberty deserve to repeat history or something. - Benjamin Santayana
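
A few of the newline representations listed in comment 10 can be checked directly. The following is a minimal Python sketch, not anything posted in the thread; the helper names visible and ctrl are hypothetical and exist only for illustration. It shows that \n, decimal 10, and hex 0A are the same character, that U+2424 is only a picture of a newline, and that $ in a regular expression matches before a trailing newline without consuming it.

```python
import re
import unicodedata

# The escape sequence, the decimal code, and the hex code name one character.
newline = "\n"
assert newline == chr(10) == "\x0a"

# A raw string keeps the backslash and the letter as two separate characters.
assert len(r"\n") == 2

# Python also accepts an actual line break inside a triple-quoted string.
literal = """a
b"""
assert literal == "a\nb"

# U+2424 is a visible symbol for a newline, not a line break itself.
SYMBOL_FOR_NEWLINE = "\u2424"
symbol_name = unicodedata.name(SYMBOL_FOR_NEWLINE)


def visible(text):
    """Hypothetical helper: show each line feed as the U+2424 symbol."""
    return text.replace("\n", SYMBOL_FOR_NEWLINE)


def ctrl(letter):
    """Hypothetical helper: map caret notation such as ^J to its control character."""
    return chr(ord(letter.upper()) - 64)


# $ matches just before the trailing newline; the newline is not part of the match.
match = re.search(r"b$", "ab\n")

if __name__ == "__main__":
    print(symbol_name)
    print(visible("line one\nline two\n"))
    print(repr(ctrl("M") + ctrl("J")))
    print(match.group(), match.end())
```

The caret arithmetic works because ^J and ^M are defined as the letter's ASCII code minus 64, which is why they land on line feed (10) and carriage return (13).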