Slashdot Mirror


Why Haven't Special Character Sets Caught On?

theodp asks: "Almost forty years after Kenneth Iverson's APL\360 employed neat Selectric hacks to implement Special Character Sets to express operators with a single symbol, we're still using clunky notation like '<>', '^=', or 'NE' to represent inequality and cryptic escape sequences like '\n' to denote a new line, even though the Mac brought GUIs to the masses more than twenty years ago. Why?"

15 of 117 comments

  1. \n cryptic? by Anonymous+Crowhead · · Score: 2, Insightful

    And special characters wouldn't be?

  2. Why? by wowbagger · · Score: 5, Insightful
    Why are we not using characters that are:
    1. Hard to generate on a standard keyboard
    2. Not standardized in the specifications of the language.
    3. Not standardized in the character sets of most non-bitmapped displays.
    4. Not standardized in HTML markup.


    Gosh, I don't know!

    Now, if you will excuse me, I need to create a local variable named <The Symbol for the Artist Formerly Known as "The Artist Formerly Known As Prince">
  3. Input method simplicity by Kelson · · Score: 3, Insightful

    In programming? Most languages seem to be designed with ASCII in mind, so you have to stick with what's available there.

    In general? I think it's a matter of input methods. Give me an input method where it takes only two keystrokes to type "≠" and I'll use it instead of "NE" or "<>". If I need to use a vulcan death grip, remember a code, or find it in a character map, I'm only going to bother when I have motivation: either making a point, like earlier in this paragraph, or making a polished document. Why go to the effort in a casual email, or a forum post, when it's much easier to type "<>" instead?

    1. Re:Input method simplicity by Eric+Giguere · · Score: 2, Insightful

      If I need to use a vulcan death grip

      If you think emacs editing sequences are obscure now, imagine how much more fun they'd be with all those "special characters"...

      If you're a touch typist, you really want to minimize the number of keys you have to press simultaneously to get something done, especially if you can't use hands separately to do it. Typing two or more normal characters together is much easier.

      Eric
  4. Take a step back and look at this question again by LeninZhiv · · Score: 4, Insightful

    \n is cryptic and APL isn't?

    I'd say it's more a question of 'choose your poison'. There is a learning curve whether one aims at mathematics-based notation schemes or historical computer science notations, and the market has already chosen (30 years ago) which one it prefers.

    And not without cause. Human language looks a lot more like modern programming languages than mathematical notation, and a major goal of programming language design is to make it as straightforward as possible to tell the computer what you want it to do. One might object that by that argument Cobol is better than C, but humans, especially experts working in a specific domain, like abbreviations too. Cobol is hated because it doesn't allow you to abbreviate, not because it is hard to read, after all. APL or other such specialised syntaxes are hard to read and they don't fit closely enough with the way non-mathematicians think to be intuitive.

  5. Efficient by Threni · · Score: 2, Insightful

    Because we don't need to change for the sake of it, to a system which isn't supported by a lot of software and hardware. Why not just change your software to interpret the characters as an image, like some already do with smilies?

  6. Simple by fm6 · · Score: 4, Insightful
    Same reason the Dvorak keyboard has never caught on -- nobody wants to learn to type all over again.

    Display was never the issue with APL. There are implementations of APL that use keywords instead of symbols. It's just that turning everything into an operator makes for really dense, hard-to-maintain code.

    I'm reminded of Forth, which lacks APL's weird symbols, but shares its reputation for dense code. In its heyday, Forth programmers justified using it by claiming it made them more productive. And that's true — if you define "productivity" as "number of lines of new code hacked out per day". But code isn't just written, it's maintained, and dense languages are not maintenance friendly.

  7. Because of old and crappy software, and laziness by metamatic · · Score: 2, Insightful

    Because standardization of extended character sets, via Unicode, is a relatively recent development. Hence, there's a lot of software around that still doesn't handle Unicode.

    For example, I switched to bash because tcsh didn't cope with Unicode. Mozilla's Unicode support is incomplete--card symbols defined in the HTML 4.01 standard don't show up properly on the Mac, even though it definitely has them in its standard fonts. Many text editors don't support Unicode. And so on.

    In fact, it's only recently that Slashdot was fixed to allow us to use words like "cliché" and enter amounts of money in Pounds Sterling like £5.99, even though those 'special' characters were part of HTML 1.0. Forget about using the aforementioned card symbols on Slashdot—we got 1996's CSS a couple of months ago, maybe we'll get 1999's HTML 4 in 2008?
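
    For reference, the characters mentioned above all have named HTML character references. A minimal sketch in Python, assuming the standard library's html module as a stand-in for what a browser does when it resolves them (the sample strings are made up for illustration):

```python
import html

# html.unescape resolves named character references to Unicode characters.
text = html.unescape("clich&eacute; costs &pound;5.99")
suits = html.unescape("&spades; &clubs; &hearts; &diams;")

print(text)
print(suits)
```

    Whether the result displays properly still depends on the font and the software in between, which is the complaint here.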

    Next you add in the fact that most people are too lazy to even learn to spell correctly, far less learn how to type an e with an acute accent, and you have a recipe for today's state of the web.

    --
    GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
  8. it is really a deep one. by torpor · · Score: 4, Insightful

    {disclaimer: i'm a closet fontographer.}

    i've thought about this question since 1978, as i have encountered over the years since then a grand litany of different ways of describing symbols in such a way that they can be standardly used, and i have come to a very simple answer. humans are stuck on a symbol treadmill with infinitely smooth bearings.

    fontography is a lesson of symbols .. and the description of these symbols is limited by strict hardware limits: economic, social, cultural elements all have a part to play in the definition of input devices. where i say QWERTYZXCV, you say QWERTZYXCV.

    we haven't seen terribly wide-spread specialization of symbols because of the producer-/consumer- cults of USKEY101, and people's unfamiliarity with alt-numkeypad chops, and Mac vs. PC, and ASCII vs. UTF-8, and XML vs. .bin, and "X" vs. "Y", blah blah, ad infinitum..

    the fact is, perhaps deep down inside we know we should be grateful for what we've got, and let the "!=" and ">=" expressions, 2 lonely bytes in a vast nasty sea, stand as testament to the human desire to at least, a little bit, get along on the same key. they may not be pretty, but pretty much everyone can get to those two bytes and use them when they need to .. it's only a tiny clique that can do the alt-numpad thing, and even fewer who choose to jump out of the ASCII pool and towel off..

    --
    ; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
  9. data entry by TheSHAD0W · · Score: 2, Insightful

    I think a large part of it is because, even if we have the ability to display the characters, we don't have a convenient way to enter them. The keyboard doesn't have a Sine symbol key. Further, expanding the keyboard to include these symbols will just make it unwieldy. I suppose one could have the display automatically convert sequences into special characters, much like modern word processors perform auto-superscript, but this might cause problems when editing. I personally prefer it as-is.
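
    The display-side conversion described above can be sketched in a few lines. This is only an illustration: the DIGRAPHS table and the prettify/asciify names are hypothetical, and a real editor would need to tokenize the source rather than replace text blindly.

```python
# Hypothetical digraph table: these pairings are illustrative, not a standard.
DIGRAPHS = {
    "!=": "\u2260",  # NOT EQUAL TO
    "<=": "\u2264",  # LESS-THAN OR EQUAL TO
    ">=": "\u2265",  # GREATER-THAN OR EQUAL TO
}


def prettify(source: str) -> str:
    """Replace each ASCII digraph with its single-symbol form for display."""
    for ascii_form, symbol in DIGRAPHS.items():
        source = source.replace(ascii_form, symbol)
    return source


def asciify(display: str) -> str:
    """Map the symbols back to ASCII digraphs before saving or compiling."""
    for ascii_form, symbol in DIGRAPHS.items():
        display = display.replace(symbol, ascii_form)
    return display


line = 'if a != b and c >= d: print("x <= y")'
shown = prettify(line)
print(shown)
# The blind replacement also rewrites the text inside the string literal.
print(asciify(shown) == line)
```

    The round trip only works here because the original line contained none of the symbols to begin with. A file that already had one of them inside a string would come back changed, which is one form of the editing problem mentioned above.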

  10. Lowest common denominator by Craig+Maloney · · Score: 4, Insightful

    It's pretty simple: Lowest common denominator. Creating special character sets creates incompatibilities with other machines out there. That's why ASCII was such a boon, and why character sets like PETASCII, ATASCII, and others fell by the wayside. (And if you really want some character set fun, try EBCDIC sometime).
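
    To see the incompatibility concretely, here is a small Python sketch that encodes the same text in ASCII and in one EBCDIC variant. It assumes Python's cp037 codec (EBCDIC for US/Canada) as a representative; other EBCDIC code pages differ in places.

```python
word = "Hello"

ascii_bytes = word.encode("ascii")
ebcdic_bytes = word.encode("cp037")  # cp037 is one EBCDIC code page among many

# Same five letters, completely different byte values.
print(ascii_bytes.hex())
print(ebcdic_bytes.hex())

# Each side only recovers the text if it knows which encoding was used.
print(ebcdic_bytes.decode("cp037") == word)
```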

  11. And music is hard to read for non-musicians by Quarters · · Score: 3, Insightful

    \r\n, =, !=, etc... make sense to programmers. They understand the language. Just like the design of 32nd, 16th, 8th, 1/4, 1/2, and whole notes, along with extra notation to modify their true length of play and volume, makes sense to musicians. Why waste time and effort to make it readable for the masses when the masses probably don't care? If they did they'd learn to read the language.

  12. Re: \n as newline by some+guy+I+know · · Score: 3, Insightful
    And for non-visual characters like 'newline'.... what other idea, exactly, did you have?
    How about U+2424?
    Actually, that's the symbol for a graphic representing a newline (a slightly raised N next to a slightly lowered L, shrunk and crammed together into an area approximately a single em-space wide), so maybe that's not such a good idea (as how would you represent the graphic itself in a string?).
    OTOH, a \ followed by U+2424 could better represent a newline graphically in a string.

    The reason that \n seems "pretty straightforward" is that most of us are used to it.
    The concept of backslash followed by a letter representing a control character started in C in the early 1970s (or possibly even in earlier languages), and has been copied into dozens of other languages, along with other things like using % in printf strings to format variables (although some languages, like Ruby, are starting to offer alternative representations to %).
    Note that, in Common LISP, a newline is represented by ~% and ~& in formatting strings, and #\Newline (spelled just that way) represents a newline character outside of formatting strings.
    In Object Pascal/Delphi, a newline is represented by its decimal or hexadecimal equivalent, #10 or #$0A.
    Some languages, like Python and sh/ksh/bash/etc., allow an actual newline in a string itself, so no representation is necessary (although Python allows \n as well, in its non-raw strings).
    Other representations that I have seen in the past include ^J and ^M^J (for line feed and carriage return/line feed as control characters) and $ (for end-of-line in regular expressions (although the $ doesn't (usually) match the actual newline itself)) and in "list" mode in vi.
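
    Several of these representations can be checked against each other in Python. This is a rough sketch using only the standard library; it says nothing about how the other languages implement them.

```python
import re
import unicodedata

newline = "\n"

# Delphi's #10 and #$0A are the same code point written in decimal and hex.
print(ord(newline), hex(ord(newline)))

# Caret notation: ^J is the letter's code minus 64, and likewise ^M.
print(chr(ord("J") - 64) == "\n", chr(ord("M") - 64) == "\r")

# U+2424 is a visible glyph that stands for a newline, not a newline itself.
symbol = "\u2424"
print(unicodedata.name(symbol), symbol == newline)

# A raw string keeps the backslash and the letter as two characters.
print(len(r"\n"), len("\n"))

# A literal line break inside a triple-quoted string is the same character.
literal = """a
b"""
print(literal == "a\nb")

# In a regex, $ matches just before a trailing newline without consuming it.
match = re.search(r"b$", "ab\n")
print(match.end())
```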
    --
    Those who sacrifice security to condemn liberty deserve to repeat history or something. - Benjamin Santayana
  13. I disagree by gurps_npc · · Score: 2, Insightful
    Most of you people are listing problems related to keyboards.

    That demonstrates a lack of vision.

    MAKE A NEW KEYBOARD.

    Not that hard to do. Almost all computers have function keys on top. The majority of users DON'T USE THEM.

    Just print up some new keyboards that have single symbols representing the major programmer stuff, such as >=. To use them, print them above the F1, F2, F3, etc. keys and access them by typing Shift-F1, etc. Allow them to be overridden by programs that want to override them.

    If Apple did this, it would catch on instantly. In one year, Microsoft would steal the idea.

    --
    excitingthingstodo.blogspot.com
    1. Re:I disagree by gurps_npc · · Score: 2, Insightful
      Our job is not to design a new keyboard for all languages. Non-English speakers already make their own keyboards. But for English speakers, there are a bunch of simple symbols that should definitely go in.

      ...Math...

      Greater than or equal to

      Less than or equal to.

      Not equal to.

      ...Programming...

      New line symbol.

      Is it Alphabetically equal to (does not set, only used for asking. Equivalent to EQ, could co-opt the wavy equal sign)

      Is it Numerically equal to (does not set, only used for asking. Equivalent to == in many computer languages, could co-opt the triple line equal sign.)

      Then there are the common symbols that are not on the keyboard. These include

      paragraph mark

      pound mark

      the cross used to signify footnotes

      The copyright mark

      the registered trademark

      the small circle indicating temperature.

      These 12 symbols are used throughout the English world. Again, the idea is NOT to make an English keyboard usable for other languages, but instead to expand the use of the keyboard to include the 12 most common symbols used within the English world. Non-English language keyboards should of course expand their own keyboard, but that is up to them, not those of us that speak English.

      --
      excitingthingstodo.blogspot.com