Slashdot Mirror


Mr. Pike, Tear Down This ASCII Wall!

theodp writes "To move forward with programming languages, argues Poul-Henning Kamp, we need to break free from the tyranny of ASCII. While Kamp admires programming language designers like the Father-of-Go Rob Pike, he simply can't forgive Pike for 'trying to cram an expressive syntax into the straitjacket of the 95 glyphs of ASCII when Unicode has been the new black for most of the past decade.' Kamp adds: 'For some reason computer people are so conservative that we still find it more uncompromisingly important for our source code to be compatible with a Teletype ASR-33 terminal and its 1963-vintage ASCII table than it is for us to be able to express our intentions clearly.' So, should the new Hello World look more like this?"

728 comments

  1. The thing with ASCII by enec · · Score: 5, Insightful

    The thing with ASCII is that it's easy to write on standard keyboards, and does not require a specialized layout. Once someone can cram the necessary unicode symbols into a keyboard so that I don't have to remember arcane meta-codes or fiddle with pressing five different dead keys to get one symbol, I'm all for it.

    --
    I'm sorry, I only accept criticism in the form of sed expressions.
    1. Re:The thing with ASCII by mail2345 · · Score: 1

      Different keyboard manufacturers have a selected range of keys that their keyboards can type, which will probably get confusing:
      "Only hp unicode keyboards have the print symbol. For other unicode keyboards or ASCII keyboards, type alt-8713 to get the print symbol."

    2. Re:The thing with ASCII by angus77 · · Score: 4, Informative

      Japanese is typed using a more-or-less standard QWERTY keyboard.

    3. Re:The thing with ASCII by thenextstevejobs · · Score: 0

      The thing with ASCII is that it's easy to write on standard keyboards.

      Why should the notations which we use to express our programs be limited to 'standard keyboards'?

      I'm sure there could be decent schemes for writing alternate symbols with meta-keys and such. Learn a new keyboard layout, it won't kill you. Reminds me of folks refusing to learn a language other than C++/Java/whatever because they are afraid it'll cause them some irreparable mental damage.

      For example, I'd love to use standard logic symbols to express statements in my day to day coding, why not? Well, because I'm writing C/Ruby. But hey, I'd like to see them available as an alternative perhaps, not required?

      Shooting this down because the keyboard we're all using in 2010 doesn't accommodate it well doesn't seem like the best way forward to me. Seems like the whole Ford 'faster horse' sort of thing. Take a longer view. Think about the possibilities. Maybe there's some cool things this would open up.

      I don't think lines of code are taking up storage such that we'd have any trouble moving to UTF-8, 16, or any other longer format than ASCII.
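      A quick Python sketch of the storage point, using a made-up line of code as the sample: pure-ASCII source is byte-for-byte the same size in UTF-8, doubles in UTF-16, and a single non-ASCII operator such as U+2264 costs three bytes in UTF-8.

```python
# Hypothetical sample line; any pure-ASCII source line behaves the same way.
line = 'if a <= b: print("hello")'

# Encoded size of the same line in three encodings.
sizes = {enc: len(line.encode(enc)) for enc in ("ascii", "utf-8", "utf-16-le")}

# One non-ASCII operator (U+2264, LESS-THAN OR EQUAL TO) in UTF-8.
le_bytes = len("\u2264".encode("utf-8"))

for enc, size in sizes.items():
    print(f"{enc:10} {size} bytes")
print(f"U+2264 in UTF-8: {le_bytes} bytes")
```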

      --
      Long live the BSD license
    4. Re:The thing with ASCII by MichaelSmith · · Score: 5, Informative

      Japanese is typed using a more-or-less standard QWERTY keyboard.

      Tediously.

    5. Re:The thing with ASCII by arth1 · · Score: 5, Insightful

      Once you've had to do an ad-hoc codefix through a serial console or telnet, you appreciate that you can write the code in 7-bit ASCII.

      It's not about being conservative. It's about being compatible. Compatibility is not a bad thing, even if it means you have to run your unicode text through a filter to embed it, or store it in external files or databases.

      It'd also be hell to do code review on unicode programs. You can't tell many of the symbols apart. Is that a hyphen or a soft hyphen at the end of that line? Or perhaps a minus? And is that a diameter sign, a zero, or the DaNo letter "Ø" over there? Why doesn't that multiplication work? Oh, someone used an asterisk instead of the multiplication symbol which looks the same in this font.

      No, thanks, keep it compatible, and parseable by humans, please.
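
      To make the look-alike problem concrete, here is a minimal Python sketch that prints the code point and official Unicode name of a few easily confused characters. The particular list is an illustrative assumption (the "multiplication symbol" is taken to be U+2217 ASTERISK OPERATOR); only the standard `unicodedata` module is used.

```python
import unicodedata

# Characters that are hard to tell apart in many fonts (illustrative selection).
LOOKALIKES = ["-", "\u00ad", "\u2212", "\u00d8", "\u2300", "*", "\u2217"]

def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"

for ch in LOOKALIKES:
    print(describe(ch))
```

      On screen several of these are indistinguishable; to a compiler they are entirely different tokens.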

    6. Re:The thing with ASCII by Firethorn · · Score: 1

      True, but I remember reading that it was complex enough that many reporters preferred to dictate to a voice recognition system than to try to type their story in.

      It seemed to work a lot like predictive keystrokes on a cellphone.

      I have no real problems with allowing Unicode in programming, but I'd see it mostly being used in defining strings and naming variables, and even then you'd probably want to restrict the character set simply because so many of the symbols look so similar, yet are so different code wise.

      Sure, with Unicode you could probably make every function a single character, but human minds aren't really written to recognize that. Sure, Chinese and Japanese do the 'one word one character' thing, but they also end up with like 3 character sets and a substantial additional amount of work learning said additional characters.

      --
      I don't read AC A human right
    7. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      And it takes 12 of them to do it. One to dictate and 11 to type.

    8. Re:The thing with ASCII by js3 · · Score: 1

      doesn't mean the rest of us need to suffer the same fate

      --
      did you forget to take your meds?
    9. Re:The thing with ASCII by Ernesto+Alvarez · · Score: 3, Informative

      Japanese is typed using a more-or-less standard QWERTY keyboard.

      ...then requiring the input to pass through what amounts to a tokenizer to get the phonetic spelling, and into another program, which needs a database of words and has to prompt you for each one in order to select the proper one from a list.

      Not something as simple as writing ASCII by a long shot.

    10. Re:The thing with ASCII by BronsCon · · Score: 2, Informative

      I recommend that everyone GOAT SEe the parent video ASAP

      --
      APK quotes people (including myself) without context and should not be trusted. Just thought you should know.
    11. Re:The thing with ASCII by fwarren · · Score: 1

      Perhaps you like the idea of ColorForth http://www.colorforth.com/

      Color is used to denote different states. The equivalent in C would be where includes are in RED, and functions are in blue while their parameters are in green and some {} are no longer needed because of the color coding.

      Mind you, it totally sucks if you are color blind. But you are able to create significantly terser code because of the amount of syntax that is represented by color.

      --
      vi + /etc over regedit any day of the week.
    12. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      Non-ASCII characters won't be possible to type easily, no, but IDEs can assist: e.g. converting <= to the math symbol for LE, or != to the math symbol for NE. The previous example would increase readability, but it would also kill the "write anywhere" principle.

      I know this post will incite a lot of negative responses saying things like "That idea blows!" or "I'll sacrifice my girlfriend before programming in that language!" Let me say that I'm only mentioning a possibility, not recommending the idea as a good or clever one.
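
      A rough Python sketch of what such editor assistance could look like: the file stays ASCII on disk and the symbols are only a display layer. The mapping table and function names are hypothetical, and the naive string replacement would also rewrite operators inside string literals, which a real editor would need a tokenizer to avoid.

```python
# Hypothetical display mapping from ASCII digraphs to Unicode operators.
DISPLAY = {"<=": "\u2264", ">=": "\u2265", "!=": "\u2260"}

def prettify(source):
    """Replace ASCII digraphs with their Unicode symbols for display."""
    for ascii_op, symbol in DISPLAY.items():
        source = source.replace(ascii_op, symbol)
    return source

def asciify(source):
    """Map the Unicode symbols back to ASCII before saving."""
    for ascii_op, symbol in DISPLAY.items():
        source = source.replace(symbol, ascii_op)
    return source

code = "if a <= b and c != d:"
print(prettify(code))
```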

    13. Re:The thing with ASCII by jonbryce · · Score: 1

      Japanese characters are mostly sound-based rather than meaning-based, though a single Japanese character will generally map to two latin characters.

    14. Re:The thing with ASCII by Kagetsuki · · Score: 1

      I get the impression you have no idea what an IME is....

    15. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      Japanese characters are mostly sound-based rather than meaning-based

      Only for some values of 'character'.

    16. Re:The thing with ASCII by Kagetsuki · · Score: 5, Informative

      I'm Japanese, so let me clarify how entering Japanese works here: Japanese is composed of two sets of Kana (characters with no meaning but they have a sound) and Kanji (characters with meaning). To enter a word in Japanese, let's say the word "Me/I", you would hit a key to activate your IME [input method editor] - usually the key on the top left of the keyboard, then type "watashi", just like that, and you would get it in kana (hiragana). Next hit the space key, that converts it to kanji. Now hit enter to finish input or just start typing your next word. You can also enter multiple words, hit space, and then break up and convert the sentence all at once. It is not difficult, you don't actually need a special keyboard, and I've never heard of anybody capable of using a keyboard using voice recognition because they found the act of entering in words laborious.
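
      The two steps described above (romaji to kana, then kana to kanji on space) can be sketched in a few lines of Python. This is a toy under stated assumptions: the tables hold only a handful of entries, the names are hypothetical, and a real IME uses a large dictionary plus context to rank candidates.

```python
# Tiny illustrative tables; a real IME ships far larger ones.
KANA = {"wa": "わ", "ta": "た", "shi": "し", "ka": "か", "ni": "に"}
CANDIDATES = {"わたし": ["私"], "かに": ["蟹"]}

def to_kana(romaji):
    """Greedy longest-match conversion from romaji to hiragana."""
    out, i = [], 0
    while i < len(romaji):
        for size in (3, 2, 1):
            chunk = romaji[i:i + size]
            if chunk in KANA:
                out.append(KANA[chunk])
                i += size
                break
        else:
            raise ValueError(f"no kana for {romaji[i:]!r}")
    return "".join(out)

def convert(kana):
    """What pressing space does: look up kanji candidates for the kana."""
    return CANDIDATES.get(kana, [kana])

kana = to_kana("watashi")
print(kana, convert(kana))
```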

    17. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      The thing with ASCII is that it's easy to write on standard keyboards, and does not require a specialized layout.

      Define "standard keyboard".

      You do realize that the keyboards in the US (and English Canada) are different than those in France, are different than those in Germany, are different than those in Japan, are different than those in China, are different than those in....

    18. Re:The thing with ASCII by offsides · · Score: 0

      The original article talks about "write-only languages." I see the proposal to allow Unicode source as creating "read-only languages" - hard to write, impossible to debug, but fairly easy for someone to read, even if they're not a programmer. This proposal isn't about giving programmers more power to code, it's about making it easier for non-english speakers who aren't coders to read the code that their programmers write.

      Real programmers understand the fundamental limitations of a parser/compiler, as well as the need for a consistent set of reserved words and symbols...

    19. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      I don't see any keys.
      Just a huge, gaping hole.

    20. Re:The thing with ASCII by Tablizer · · Score: 3, Informative

      It'd also be hell to do code review on unicode programs. You can't tell many of the symbols apart. Is that a hyphen or a soft hyphen at the end of that line?

      If you want to test and/or frustrate a newbie, replace one of those in their program and see how long it takes them to fix it.

      The first time I ran into something like that it took me a good while. I ended up comparing hex dumps to find it. I should have just retyped the suspect code sections from scratch instead, but I was determined to get to the bottom of it and find out exactly why it crashed.

      It certainly turned me back into an ASCII fan.
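
      Short of comparing hex dumps, a few lines of Python can point straight at the offending character. This is a minimal sketch: the function name and the sample text are made up for illustration, and it simply reports every character outside 7-bit ASCII with its position and Unicode name.

```python
import unicodedata

def find_non_ascii(text):
    """Return (line, column, 'U+XXXX NAME') for every non-ASCII character."""
    hits = []
    # Split on "\n" only, so Unicode line separators are reported too.
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                name = unicodedata.name(ch, "<unnamed>")
                hits.append((lineno, col, f"U+{ord(ch):04X} {name}"))
    return hits

# Hypothetical sample: the second line uses U+2212 instead of '-'.
sample = "total = a - b\ndelta = a \u2212 b\n"
for hit in find_non_ascii(sample):
    print(hit)
```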

    21. Re:The thing with ASCII by Anonymous Coward · · Score: 3, Funny

      I find the act of reading your descriptions laborious, and have decided to never bother learning Japanese just so I don't have to put up with that kind of thing EVER.

      If I could, I'd probably go about eliminating the whole language just as a gift to humanity.

      But I'm still working on my ultimate plan for destruction of English, and that has priority.

    22. Re:The thing with ASCII by znerk · · Score: 2, Informative

      Japanese characters are mostly sound-based rather than meaning-based, though a single Japanese character will generally map to two latin characters.

      I assume you're referring to the katakana, here... So, yes, using a phonetic set of approximately 50 characters, your writing will be sound-based.
      Unfortunately, you are also underinformed, as there are actually 3 character-based written languages in use in Japanese writing.
      Part of the problem, here, would be that the same (spoken) word can refer to many different concepts, and the (non-phonetic) written language reflects the meanings, rather than the pronunciation. For example:

      Some Japanese words are written with different kanji depending on the specific usage of the word—for instance, the word naosu (to fix, or to cure) is written as "" when it refers to curing a person, and "" when it refers to fixing an object.

      Bah, slashdot apparently doesn't like my attempt to use the characters. Whatever, the quoted text is from the linked article.

      --
      This work is licensed under a Creative Commons Attribution 3.0 Unported License.
    23. Re:The thing with ASCII by Anonymous Coward · · Score: 1, Funny

      You must be new here

    24. Re:The thing with ASCII by Z34107 · · Score: 3, Insightful

      I find the act of reading your descriptions laborious, and have decided to never bother learning Japanese just so I don't have to put up with that kind of thing EVER.

      "That kind of thing" is quite literally hitting the "space" key between words. I'm surprised you managed to put up with it long enough to finish your post.

      --
      DATABASE WOW WOW
    25. Re:The thing with ASCII by bsa3 · · Score: 1

      +1. You can pry my vt320 out of my cold, dead hands. And, no, that's not a vt320 emulator.

    26. Re:The thing with ASCII by swdunlop · · Score: 1

      And then we'd have "Mr. Moore, Tear Down This Colored Wall" rants on Slashdot. There's no pleasing some people. ;)

    27. Re:The thing with ASCII by Ichijo · · Score: 0

      Once you've had to do an ad-hoc codefix through a serial console or telnet, you appreciate that you can write the code in 7-bit ASCII.

      What prevents Telnet from ever using Unicode?

      You can't tell many of the symbols apart. Is that a hyphen or a soft hyphen at the end of that line?

      Do you think language designers would really use both symbols and not make them interchangeable?

      --
      Any sufficiently unpopular but cohesive argument is indistinguishable from trolling.
    28. Re:The thing with ASCII by angus77 · · Score: 2, Interesting

      Japanese is typed using a more-or-less standard QWERTY keyboard.

      Tediously.

      Not in the least. I do it every day at work. It takes little more effort than writing in English. Unless, of course, your Japanese reading skills are not up to the job---but that won't be the fault of the keyboard.

      Please let me emphasize that typing with a QWERTY keyboard is the standard way of typing in Japan. In fact, despite the existence of other methods, I don't know a single person who actually uses those methods.

    29. Re:The thing with ASCII by Tumbleweed · · Score: 1

      Japanese is typed using a more-or-less standard QWERTY keyboard.

      The three commonly-used character sets used in Japanese (Hiragana, Katakana, and Kanji) are hardly a _good_ example of how to do written communication. They actually only *need* one of them (Hiragana or Katakana). The argument that in written communication, you can use the others for various things neatly sidesteps the fact that in spoken Japanese, there is no such distinction.

    30. Re:The thing with ASCII by modmans2ndcoming · · Score: 1

      bingo!!

    31. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      Whatissoimportantaboutthespacekeyanyway

    32. Re:The thing with ASCII by Angst+Badger · · Score: 5, Insightful

      Funny you mention it, but the first thing I thought of was Japanese text entry, followed by the autocorrect/text-expansion facility that most word processors have, which is much the same thing applied to western languages. I've also thought it would be good to be able to make use of mathematical symbols for, you know, mathematics. The same could be said of word processor-like formatting for comments. I'm dubious about using it for actual code, but I'm open to having my mind changed about that.

      (Color-as-syntax has already been done in Chuck Moore's latest implementation of Forth. It's not a bad idea, though I suspect it works better with low-level languages like Forth than it would with a higher level language.)

      The second thing I thought of was what I always think when someone starts complaining about what languages should and shouldn't have, which is this: Quit bitching and go implement it, smart boy. Come up with something good, and I'll use it, but I am not about to run out and implement someone else's ideas. I have a day job where I get to do that all fucking day long, and they actually pay me. And contrary to popular belief, ideas are cheap and plentiful, including good ideas. The time, effort, and dedication that it takes to actually implement them are what's in short supply.

      --
      Proud member of the Weirdo-American community.
    33. Re:The thing with ASCII by Z34107 · · Score: 5, Funny

      Typing Japanese is exactly like typing in English - you press the "space" key between words. The IMEs are pretty smart, and usually the first kanji is the one you want. If it's not you might have to press "space" a second or third time, but it's rare to have to dig through a giant list of kanji to get what you want.

      So, you might have to hit the space key more often if you're typing Japanese. Or, you might not - you can space-to-kanji entire sentences at once, whilst the romance languages are stuck hitting space between every word like shmucks. Except for the Germans. I don't think their language uses spaces.

      The Japanese keyboard layout also produces kana (most of which are romanized with two latin characters) rather than individual letters. Instead of typing w-a-t-a-s-h-i-space, you type wa-ta-shi-space.

      So, it's really not that bad. What's worse is the irony of seeing an article on slashdot complain about the persistence of ASCII. I mean, really now, slashdot.jp manages to display non-ASCII characters.

      --
      DATABASE WOW WOW
    34. Re:The thing with ASCII by A.+Bosch · · Score: 1

      Where's my mod points? Spot-on observations.

      --
      Where there is the necessary technical skill to move mountains, there is no need for the faith that moves mountains.
    35. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      To enter a word in Japanese, let's say the word "Me/I", you would hit a key to activate your IME [input method editor] - usually the key on the top left of the keyboard, then type "watashi", just like that, and you would get it in kana (hiragana). Next hit the space key, that converts it to kanji.

      Umm.. OK... and that's simpler than just typing "Me/I" how?

      It is not difficult

      Sure, except for all that stuff you just wrote.

    36. Re:The thing with ASCII by macshit · · Score: 1

      The three commonly-used character sets used in Japanese (Hiragana, Katakana, and Kanji) are hardly a _good_ example of how to do written communication. They actually only *need* one of them (Hiragana or Katakana). The argument that in written communication, you can use the others for various things neatly sidesteps the fact that in spoken Japanese, there is no such distinction.

      Written and spoken language are not exactly the same though -- written language tends to be more formal and more complex. Kanji actually help comprehension/speed a great deal when reading (they both literally disambiguate, and make good use of the human visual system's pattern-recognition ability).

      There have been numerous attempts to try and eliminate them, both under pressure from the American occupying forces after WWII, and due to linguistic fashion later on, but in the end, they've always failed, essentially under the weight of common sense: you don't make wholesale changes to such a fundamental part of your culture without an awful lot of justification!

      In current times, the trend may actually be in the other direction, as computers have made kanji usage easier for many people (even if at the same time they've had negative effects on many people's ability to write them by hand)...

      --
      We live, as we dream -- alone....
    37. Re:The thing with ASCII by Darinbob · · Score: 1

      I agree. Non-ascii is difficult to create on most English keyboards. If you have Latin characters, they'd be hard to create on non-Latin keyboards as well. ASCII, despite the "A" is the closest to a commonly usable international character set.

      Then there's the snag that characters may look the same, or extremely similar, but have different codes. I've definitely had problems in the past with three periods vs an ellipsis, or dashes versus some odd glyph that looks like a dash.
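
      Unicode compatibility normalization handles part of this, but only part. In the Python sketch below (the helper name `fold` is an assumption; `unicodedata.normalize` is standard library), NFKC turns the single ellipsis character into three periods, while an en dash passes through unchanged, so dash look-alikes still need explicit handling.

```python
import unicodedata

def fold(text):
    """Apply NFKC compatibility normalization."""
    return unicodedata.normalize("NFKC", text)

ellipsis = "wait\u2026"   # U+2026 HORIZONTAL ELLIPSIS
en_dash = "a\u2013b"      # U+2013 EN DASH

print(fold(ellipsis) == "wait...")  # the ellipsis becomes three periods
print(fold(en_dash) == en_dash)     # the en dash is left as it was
```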

    38. Re:The thing with ASCII by rainer_d · · Score: 4, Funny

      Typing Japanese is exactly like typing in English - you press the "space" key between words. The IMEs are pretty smart, and usually the first kanji is the one you want. If it's not you might have to press "space" a second or third time, but it's rare to have to dig through a giant list of kanji to get what you want.

      So, you might have to hit the space key more often if you're typing Japanese. Or, you might not - you can space-to-kanji entire sentences at once, whilst the romance languages are stuck hitting space between every word like shmucks. Except for the Germans. I don't think their language uses spaces.

      NatürlichhabenwirLeerzeichen!

      --
      Windows 2000 - from the guys who brought us edlin
    39. Re:The thing with ASCII by mywhitewolf · · Score: 1

      Fascinating, but it does bring to light the fallacy of the suggested changes to programming.

      You essentially type in a command, which is then converted into something that the reader would understand. Change the reader from a Japanese-reading person to a computer compiler and it shows how unnecessary it would be to have additional characters.

      I guess if you are as proficient typing in "!!" (2*(shift+1)) as you are typing in "" (alt+787) it won't make any difference? I don't understand how they would consider character limitation an issue with coding. Could we move to a uni-character system for coding? Sure, but why? It's completely unnecessary.

    40. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      He writes:

      > Why not make color part of the syntax?

      Well, I can think of two reasons:

      1) because that would be stupid, and
      2) it would be an unnecessary burden for anyone who is color-blind.

      Dear Mr Kamp,

      Thank you so much for coming up with a new way to make things more difficult and obtuse for programmers. I was deathly afraid that no one could find a way to make programming more ridiculous and error-prone, but you sir, came up with a wonderful way to make things sillier and more difficult for everyone. Bravo! Perhaps next you could develop something useful for campers, like dehydrated water.

    41. Re:The thing with ASCII by angus77 · · Score: 2, Informative

      The kind of opinion you'd come to from checking Wikipedia rather than actually using it.

      Millions upon millions of Japanese (and some non-Japanese, like myself) have found the IMEs to be more than satisfactorily efficient and easy to use. Not only that, but they sometimes have predictive input as well (especially on cell phones), which makes typing in Japanese even faster and easier.

    42. Re:The thing with ASCII by ciggieposeur · · Score: 1

      What prevents Telnet from ever using Unicode?

      Nothing, so long as both sides understand TELNET BINARY, the locale of the remote system uses a Unicode encoding (in practice this means UTF-8), your terminal understands Unicode (which excludes the FreeBSD console and most hardware terminals), your terminal is capable of *displaying* the Unicode (which excludes all non-GUI terminals except the Linux console), and your font has all the glyphs you need.

      Writing a terminal emulator teaches one quite a bit about Unicode. Once it's working though it's quite nice.

      Do you think language designers would really use both symbols and not make them interchangeable?

      I imagine they would have a heart attack determining which whitespace code points should count as token breaks, which shouldn't, and which should be forbidden entirely because they make it difficult to trust someone's patches.
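
      For a sense of scale, this Python sketch lists every code point in the Unicode "space separator" category (Zs) known to the running interpreter. The exact set depends on the Unicode version bundled with Python, and characters such as U+200B ZERO WIDTH SPACE are invisible yet fall in a different category, which is exactly the kind of decision a language designer would have to make.

```python
import sys
import unicodedata

def space_separators():
    """Return all code points whose general category is Zs."""
    return [
        ch
        for ch in map(chr, range(sys.maxunicode + 1))
        if unicodedata.category(ch) == "Zs"
    ]

spaces = space_separators()
for ch in spaces:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}")
```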

    43. Re:The thing with ASCII by FatdogHaiku · · Score: 1, Funny

      So, will we be able to "print" the print symbol, or will we just fall into some recursion hell until the paper tray is empty?

      One Glyph to find them.
      One Glyph to bring them all...
      ah, never mind...

      --
      You have the right to remain sentient. If you give up the right to remain sentient, you will be elected to public office
    44. Re:The thing with ASCII by sznupi · · Score: 1

      One of those is a bit more widespread though; used in many places even if, technically, some local layout was established... (where, accidentally, it appears to be often called "locale (programmers)")

      --
      One that hath name thou can not otter
    45. Re:The thing with ASCII by wvmarle · · Score: 1

      Sometimes I wonder what the computing world would have looked like if the computer had been invented in Japan or in China, instead of in the US. Just thinking of the script they use, which is of course much harder to design a keyboard for than English or most other Western languages. Even Russian would be trivial to use.

    46. Re:The thing with ASCII by angus77 · · Score: 1

      We don't actually need to write English the way it was pronounced centuries ago. But see how far you get in life trying to spell everything as pronounced.

      You actually need all three character sets to function in Japanese society today. Keep in mind that Japanese newspapers have the highest circulations in the world. Think the Japanese are willing to go back to an oral society? I live here. They put a lot of emphasis on written communication---and even talk about the written characters in speech when clarifying something they've said that may have been ambiguous when spoken. (I'd love to give some examples, but /. is ASCII-only)

      Besides, we were talking about character input, not the appropriateness of the Japanese writing system. Typing Japanese is easy, even on Western keyboards.

    47. Re:The thing with ASCII by wvmarle · · Score: 1

      It is surely harder than entering English, if only looking at the learning curve. The same I see around me regarding Chinese (I can't type Chinese but I see other people do it).

      Just like Japanese it's a hack: you only use QWERTY because it was the standard already, before computers came to Asia. Then Asian input was patched on top of that. Were the Asians to have invented the computer, the keyboard would surely have looked very very different.

    48. Re:The thing with ASCII by Hylandr · · Score: 1

      What the OP fails to realize is that a good percentage of computers in government and scientific circles are still running those character sets. Satellites, Oceanographic Buoys, Launch systems, Modern weaponry, Avionics, etc.

      There is no quick fix, and chances are we are stuck with it. - Deal.

      - Dan.

      --
      ~ People that think they are better than anyone else for any reason are the cause of all the strife in the world.
    49. Re:The thing with ASCII by guyminuslife · · Score: 1

      The thing is, as far as I'm concerned as a Westerner, the existence of a phonetic Japanese alphabet eliminates the need, or desire, for kanji.

      --
      I don't believe in time. It's a grand conspiracy designed to sell watches.
    50. Re:The thing with ASCII by BrokenHalo · · Score: 2, Interesting

      Seriously. This programmer (I use the term loosely) has problems with expression? If this is the case, he needs to go back to school and try learning assembly or fortran programming. Any program worth writing can be coded in fortran, and if it can't be coded in assembler, then it can't be done at all.

      If he really wants to go into creative writing, we might remind him that the 26 letters of the alphabet were good enough for Shakespeare.

    51. Re:The thing with ASCII by angus77 · · Score: 1

      Proof of this is the fact that the number of standard characters (Jouyou kanji) has been increased twice: once in 1981 (from 1850 to 1945), and again in 2009 (to 2136). And there are many non-standard kanji that are in common use (the kanji for ichigo "strawberry"and kani "crab" are two that come immediately to mind).

    52. Re:The thing with ASCII by angus77 · · Score: 1

      Suffer what fate?
      Being forced to continue to use a standard QWERTY keyboard?!

    53. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      Japanese are crazy, they don't feel guilty when using a romanized layout for their language. Don't Japanese have a native keyboard layout, so that they have to use the latin abcd keyboard layout? They first need to learn Roman keys just to type Japanese, how funny !!!

    54. Re:The thing with ASCII by Fareq · · Score: 1

      thing is, I have a { key and a } key, but not a [RED] key, nor a [BLUE] key. Which means either memorizing IDE-specific keyboard shortcuts (and then relearning when i have to use a new tool), or lots of clicky-clicky with the mouse, which takes longer and does more damage over time.

    55. Re:The thing with ASCII by angus77 · · Score: 1

      Sure, Chinese and Japanese do the 'one word one character' thing, but they also end up with like 3 character sets and a substantial additional amount of work learning said additional characters.

      You're a bit confused---Classical Chinese had the 'one word one character' thing, and Japanese has three character sets (five if you include Arabic numerals and the extensive use of the Roman alphabet).

      Book sales are amongst the highest in the world, and Japanese newspapers have the highest circulations in the world. The extra time spent reading in school must be paying off for the Japanese.

    56. Re:The thing with ASCII by angus77 · · Score: 1

      **PSST!**

      I think he was emphasizing "the act of reading" rather than "your descriptions" as being laborious.

      Sometimes it helps to finish reading someone's post before responding to it.

    57. Re:The thing with ASCII by Artifakt · · Score: 1

      Admiral Halsey, you should drink your warm milk and take your nap now, Sir.

      --
      Who is John Cabal?
    58. Re:The thing with ASCII by SanityInAnarchy · · Score: 1

      Well, what better keyboard design would you suggest?

      I mean, so long as we're still coding in English, most of the keyboard is going to be the standard alphabet. There's not too much space beyond that for us to hit quickly while typing. If you want to add standard logic symbols, that probably means killing something else -- and what would you suggest?

      If it's just a one-to-one substitution, even if you managed to configure your text editor to only swap, say, % for a logical implication symbol (looks vaguely like =>), you've made it truly obnoxious for people to start using your language. If you make it multiple keystrokes -- say you take => and replace that with implication -- then really, the language you've produced would work just as well in ASCII, so you're just introducing unnecessary confusion.

      I probably wouldn't mind too much if an editor did the work for me, allowing me to still type in ASCII that looks reasonably close, and see it converted to the appropriate UTF8... But while I appreciate awesome tools, from text editors to full-featured IDEs, I prefer a language which doesn't force me to use any particular software. All my text editors, SCMs, pretty much everything I could ever possibly want to fire at a piece of software knows how to deal with ASCII. While most of it would be fine reading Unicode, I have no idea how any of it deals with allowing me to write Unicode.

      --
      Don't thank God, thank a doctor!
    59. Re:The thing with ASCII by Dahamma · · Score: 2, Insightful

      This proposal isn't about giving programmers more power to code, it's about making it easier for non-english speakers who aren't coders to read the code that their programmers write.

      No, actually, it's not. Java already allows Unicode variable and function names. This is about using Unicode in basic syntax of the language, which is IMO idiotic if you ever want your language to be adopted. I mean, he says it himself in the last paragraph - he didn't use any Unicode in his article because he was using vi, which makes it difficult - not to mention even if it was doable, it would be tedious as hell with a standard keyboard.

    60. Re:The thing with ASCII by angus77 · · Score: 3, Insightful

      And we only use the Roman alphabet for English because it was a widespread standard, even though we already had a functioning writing system that suited Englisc better and had worked for us for centuries (runes). We mangle the system with digraphs and multiple sounds for many of the characters (especially the vowels). It's a hack. We've made do.

    61. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      Well, I believe Perl 6 allows some unicode operators, so maybe you want to head yonder.

      The canonical example would be APL, though, and as far as I know all its successor languages decided to go straight ASCII.

      In the end graphical symbols aren't any more expressive than textual ones so it makes sense to go for convenience.

    62. Re:The thing with ASCII by Jeremi · · Score: 1

      Do you think language designers would really use both symbols and not make them interchangeable?

      If languages only used the symbols as synonyms for each other, then supporting both of them wouldn't help anyone. It would only make it more difficult to do meaningful diffs on your source code.

      --


      I don't care if it's 90,000 hectares. That lake was not my doing.
    63. Re:The thing with ASCII by robbak · · Score: 2, Interesting

      I wonder if those with a non-alphabetic language, like the various Chinese languages or Japanese, would have chosen a keyboard at all? It seems to me that the keyboard is really designed around a language that uses a limited number of glyphs. Even the addition of dïaçrìtîçs is really a hack on the keyboard.

      --
      Prediction for end of Universe #42: Fencepost error in Quantum_bogosort.cpp
    64. Re:The thing with ASCII by Jurily · · Score: 5, Insightful

      If he really wants to go into creative writing, we might remind him that the 26 letters of the alphabet were good enough for Shakespeare.

      Exactly. Completely Missing The Point at its best.

      1. The idea behind modern programming is reducing complexity. That can't really be done by using symbols no other programmer has ever seen before.
      2. Most programming fonts go out of their way to make those symbols look distinct. You simply have to know if that's a zero or an upper-case O. Imagine trying to figure out if that there is a Greek upper-case Omega or a "Dentistry symbol light down and horizontal with wave" (taken from TFA).
      3. APL died for a reason.
      4. Author cites C++ operator overloading as a good thing. 'Nuff said.
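
      Point 2 is easy to check for yourself. The following is only a rough sketch, assuming Python and its standard unicodedata module; the three code points were picked for illustration and are not taken from TFA. They render almost identically in many fonts, yet each is a different character as far as a compiler is concerned.

```python
import unicodedata

# Three capital letters that look nearly the same in many fonts.
# Chosen for illustration only; they are not taken from the article.
lookalikes = ["\u004F", "\u039F", "\u041E"]


def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in lookalikes:
    print(describe(ch))
```

      The names printed are the only dependable way to tell these apart, which is rather the point: the eye cannot do it.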

    65. Re:The thing with ASCII by wvmarle · · Score: 1

      Plus the fact that a spoken language changes - good chance you would not be able to understand English as it was spoken say 500 years ago. They would not only have used different words, also used a different pronunciation.

      Spelling is much more fixed, and will contain artifacts of old pronunciations as well.

      And when it comes to phonetic scripts, English is actually a quite poor example. Many European languages are pronounced much more like their spelling than English is.

    66. Re:The thing with ASCII by robbak · · Score: 1, Insightful

      So this means that you cannot touch-type in Japanese?

      (clarification: touch-typing here not just meaning not looking at the keys as you type, but not looking at the output either. If you have to check the screen to see that it has entered the right 'kanji', then surely transcription is slow.)

      I'm sure the hacks to enter [whatever the correct word to describe these 'symbolic' alphabets is] languages are very well resolved. It is just a pity that they have to exist. But I have no idea what the perfect Japanese-entry device would be. Maybe a 'chord' keyboard, where two keys are pressed simultaneously - but the learning curve!

      --
      Prediction for end of Universe #42: Fencepost error in Quantum_bogosort.cpp
    67. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      Goatse ^^^ You've been warned

    68. Re:The thing with ASCII by modecx · · Score: 1

      Gesundheit

      --
      Constitutional rights may be respected, repealed, or modified; but they must never be ignored.
    69. Re:The thing with ASCII by robbak · · Score: 1

      Yeah, this.

      Everything that he stated in that article really sounds like 'more fancy IDE', to me. You want to break out a function to a side box, and colour it pink-on-blue-with-chartreuse-surround? Make your IDE do that for you. Put formatting codes in specially created comments if you have to.

      All this has very little to do with the language.

      --
      Prediction for end of Universe #42: Fencepost error in Quantum_bogosort.cpp
    70. Re:The thing with ASCII by cgenman · · Score: 1

      Not to be too pragmatic since I'm genuinely curious, but as a native Japanese speaker and typist, is it slower to read ASCII versions of your language? Or have you adapted to it like a third way of writing?

    71. Re:The thing with ASCII by Scrameustache · · Score: 1

      The thing with ASCII is that it's easy to write on standard keyboards, and does not require a specialized layout.

      Once someone can cram the necessary unicode symbols into a keyboard so that I don't have to remember arcane meta-codes or fiddle with pressing five different dead keys to get one symbol, I'm all for it.

      I have a key just for the letter ù on my keyboard, and I know only one word that uses that letter: the french word for 'where'. Worse, there's a key for the accent, so there's no need to have a key for that letter-accent combination. Pleasepleaseplease make it the variable declaration symbol! Get this thing some use!

      --

      You can't take the sky from me...

    72. Re:The thing with ASCII by robbak · · Score: 1

      Well, anyone who wants that can program an IDE to do it: Show the NE symbol whenever it finds "!=" in the text, and save "!=" when it was displaying the NE symbol. Those who like pretty symbols (and I admit that they are nice) can have them, and those who want to type in != >= get their wish too. And if you don't want to change anything you do, you don't have to.
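
      A minimal sketch of that round trip, assuming Python; the DISPLAY table and both function names are made up for illustration and do not come from any real IDE. The file on disk stays plain ASCII, and only the view changes.

```python
# Hypothetical table: ASCII operator -> glyph shown on screen.
DISPLAY = {"!=": "\u2260", ">=": "\u2265", "<=": "\u2264"}
# Reverse table, used when the buffer is written back to disk.
STORE = {glyph: ascii_op for ascii_op, glyph in DISPLAY.items()}


def to_display(source):
    """Swap ASCII operators for their display glyphs."""
    for ascii_op, glyph in DISPLAY.items():
        source = source.replace(ascii_op, glyph)
    return source


def to_ascii(shown):
    """Swap display glyphs back to the ASCII operators before saving."""
    for glyph, ascii_op in STORE.items():
        shown = shown.replace(glyph, ascii_op)
    return shown


line = "if a != b and c >= d:"
print(to_display(line))
print(to_ascii(to_display(line)))
```

      This naive version also rewrites operators inside string literals and comments, so a real editor would presumably work on tokens rather than raw text.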

      --
      Prediction for end of Universe #42: Fencepost error in Quantum_bogosort.cpp
    73. Re:The thing with ASCII by fdrebin · · Score: 1

      Great... and if you're color blind, you can't work in this language? Ever?
      (Me, I'm blue-yellow colorblind, yellow looks blazingly bright, blue is mostly invisible to me)
      /F

      --
      Stupidity... has a habit of getting its way.
    74. Re:The thing with ASCII by fyngyrz · · Score: 1, Informative

      You know, this was tried. It was called APL. It sucked, and I mean, like the environment outside the ISS.

      We like our set of alphanumerics because it's easy to recognize, easy to compound into much more complex entities that are *also* easy to recognize, and it leverages an entire lifetime of familiarity with text.

      So please. Go away. Go away yelling about glyphs, or go away quietly, but just... go away.

      --
      I've fallen off your lawn, and I can't get up.
    75. Re:The thing with ASCII by Captain+Segfault · · Score: 1

      That works until you actually want to read something -- keeping in mind that modern IMEs make it no harder to enter kanji than to read them.

    76. Re:The thing with ASCII by spongman · · Score: 2, Insightful

      Typing Japanese is exactly like typing in English

      hardly. when you type in english, you think of the word and you type in the letters of that word.

      when you type in japanese, you think of the word, then you have to translate it at least once (maybe twice) in your head before you have a list of roman letters to type. Then you have to assist the computer in guessing the reverse of the translations you just did. certainly, much of this is simple for the typist, and for the computer, but it's fundamentally different from typing a roman language.

    77. Re:The thing with ASCII by aliquis · · Score: 1

      Solution:

      Let IBM design it.

      Standard.

      Seriously, what would the alternatives be? I assume Google comes closest ..

    78. Re:The thing with ASCII by paedobear · · Score: 1

      For newspapers, it's just that they have the pushiest subscription-sellers in the world - as for books, I suspect that you'll find that a lot of those book sales are actually "just" comics - and the book/magazine/paper sales have been in freefall for a good 10 years, with Japanese publishers taking a ludicrously luddite stance to the idea of digital sales (and when they do float the idea of digital sales, the royalty rates they offer are insulting, a tiny fraction of that for analogue media even though their costs are so much lower)

    79. Re:The thing with ASCII by paedobear · · Score: 1

      And how many jouyou kanji does the average person actually know - maybe 500? They can probably only read just north of 1000, too, and this only gets worse as you look at younger people. However, "ichigo" is on the jinmeiyou (personal name) kanji list, so people should know it, and "kani" pretty much never gets used - it's rarer than "bara" (rose) in my experience, and almost as rare as "arigatou", which I have only ever seen used by native Chinese speakers.

    80. Re:The thing with ASCII by paedobear · · Score: 1

      If you mean the "English" keyboard, you have the US keyboard, UK keyboard, Irish keyboard, US (Apple) keyboard, UK (Apple) keyboard, Microsoft Extended UK keyboard, Microsoft Extended US keyboard etc etc etc...

    81. Re:The thing with ASCII by angus77 · · Score: 1

      I'm not a native speaker, but I've been studying the language for almost 15 years.

      If you put a text in Roman characters in front of a native Japanese speaker, they would most definitely find it much slower to read. Why? Because they're not used to it. I'm a native English speaker and *I* have a lot more trouble reading Japanese in Roman characters than I do in Kanji/Kana.

      (I had a script I had to read out loud recently. It was given to me in Roman characters, because it was assumed that, as a foreigner, it would be easier for me to read. I couldn't get through it without making mistakes every paragraph. I retyped it in Kanji/Kana and the only problem that remained was my funny accent.)

      In order to compare the two, you'd have to have a Japanese speaker who had made a habit of reading in Roman characters. Then you could find out which was faster/more efficient.

    82. Re:The thing with ASCII by mwvdlee · · Score: 1

      That and a font that can accurately render the different characters.
      Making the distinction between I, l, | and 1 is hard enough as it is. Why would you want to add a few hundred nearly identical-looking glyphs?
      These visual distinctions are a nuisance in plain text; in code they are compiler errors or bugs.

      --
      Slashdot social media options: AIM, ICQ, Yahoo, Jabber and Mobile Text. Why no MySpace?
    83. Re:The thing with ASCII by sznupi · · Score: 1

      If you really wonder, it's not too hard to get a bit of a taste.

      BTW, the German Z3 probably deserves to consider it as the first computer; a lot of early history was obscured due to circumstances or outright classified for many years (for pragmatic reasons, too - "3rd world" places using Enigma machines after ww2 were supposed to believe in their unbroken record...). Check also Plankalkul.

      --
      One that hath name thou can not otter
    84. Re:The thing with ASCII by AuMatar · · Score: 1

      They tried it. It was called APL. It was universally hated, because it required special keyboards with tons of extra keys or complex combinations of modifiers. It died for a reason.

      --
      I still have more fans than freaks. WTF is wrong with you people?
    85. Re:The thing with ASCII by angus77 · · Score: 1
      You'll find that an awful lot of those books (the majority, in fact) are not comics---and also that comics have a lot of words in them.

      Nobody pushed our newspaper subscription on us. We went out and got it ourselves.

      According to this guy:

      But the Japanese ebook market is already huge. In 2009 ebook sales in Japan totaled $600 million, more than triple the US sales, and without any Kindles!

      So I don't know what your bizarre non-sequitur about digital sales was supposed to be about.

    86. Re:The thing with ASCII by sznupi · · Score: 1

      I mean one of the listed in AC post; "English" wasn't among them.

      --
      One that hath name thou can not otter
    87. Re:The thing with ASCII by angus77 · · Score: 1

      This is the first time I've heard "touch-typing" defined as not being allowed to look at the screen. If that's the case, then I can't touch-type in English, either.

      Would you like to introduce any other hoops for us to jump through?

    88. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      What could "mostly invisible" mean?... (especially considering the sky)

    89. Re:The thing with ASCII by angus77 · · Score: 2, Informative

      500 kanji? Surely you're trolling? Elementary school children know more than that before they even get to Junior High. I know *I* can write more than that, and I've never even taken a calligraphy class. You couldn't read a (post-adolescent) *comic book* with only 1000 kanji, let alone a newspaper. And I know this because I read the newspaper every day(the Shizuoka Shinbun), not because I heard it from some asshat in an internet forum.

      I don't know the kanji for "bara", but I've definitely seen "kani" any number of times---not in texts, but definitely on signs and labels.

      "Arigatou" is certainly not something you'd see in kanji in texts, but I've been mailed with the kanji any number of times (and you'll certainly see it in the form "arigatai"). I doubt there's a junior high school graduate in this country who doesn't know the kanji for that.

    90. Re:The thing with ASCII by angus77 · · Score: 1

      Sometimes I am wondering how the computing world would have looked like if the computer had been invented in Japan or in China, instead of in the US.

      I imagine they would have designed something like APL rather than basing it on their own written language. It would certainly have been easier than flipping dipswitches and may have even caught on.

    91. Re:The thing with ASCII by paedobear · · Score: 1

      Well, the French keyboard is probably used in more different countries than the US keyboard then - does that satisfy you?

    92. Re:The thing with ASCII by sznupi · · Score: 1

      Whether it's really harder (at the level of "surely harder") could probably be judged only by someone who is learning both and is native in some 3rd script... preferably one similarly dissimilar to both. Better yet, a sample of such people.

      --
      One that hath name thou can not otter
    93. Re:The thing with ASCII by ThePromenader · · Score: 1

      If we're going to go outside the ASCII glyph range, then we're going to need i) an extended keyboard (or (an)other key(s) in addition to the 'cmd', 'alt' and 'ctrl' keys?) or ii) a constantly-present glyph window. The second would be a PITA for sure. I do like the article's idea of using color, though.

      --

      No, no sig. Really.

      ThePromenader
    94. Re:The thing with ASCII by paedobear · · Score: 1

      I said know not "can read" - and please try not to come across as some sort of smug "I actually USE Japanese, I'm not some neckbeard type" because I do too. I don't read Japanese newspapers because the quality of the reporting is some of the worst in the world, but I've forgotten more about the Japanese comic book industry - from an insider pov, not reader pov - than you have ever known. In fact, in terms of comic books, look at teen comic books published in the 80s, and recently, and you'll see that a lot of them now use rubi for all kanji terms, when that was really a young-kids thing 15 or 20 years ago, there's a literacy based reason for that... Kani I've never seen on labels - including in supermarkets, we shall have to agree to disagree (is this a Yokohama vs Shizuoka thing?)

    95. Re:The thing with ASCII by HonIsCool · · Score: 2, Insightful

      When I think of, por ejemplo, the word pronounced as 'hait' [*], I don't have to "translate" that at all. No, sir! Just type it straight in, exactly as it is pronounced: "height" of course! =)

      [*] IPA doesn't work on /.

      --
      "Give me six lines of C++ code written by the most competent programmer, and I will find enough in there to hang him."
    96. Re:The thing with ASCII by wvmarle · · Score: 1

      Comparing input speeds (in words per minute) the speeds people quote themselves (I learned that from reading many resumes - admittedly no hard data) for English are roughly double that of Chinese, and always a lot higher.

      Chinese characters using Changjie input (one of the hardest to learn but most used at least for input of traditional characters as it's a pretty efficient one) require 3-4 keystrokes each; English words are on average about 5 letters. Not counting spaces.

    97. Re:The thing with ASCII by fbjon · · Score: 1

      Sure, except for all that stuff you just wrote.

      Except the only difference is that at the end of a word/sentence/paragraph, you hit space, check for typos, then move on. That's it. Still not difficult, and you should probably try it before commenting. Find a japanese phrase with both a (phonetic) transliteration and the japanese original. Use the IME key to switch between Japanese and QWERTY, type in the transliteration and press space.

      --
      True confidence comes not from realising you are as good as your peers, but that your peers are as bad as you are.
    98. Re:The thing with ASCII by sznupi · · Score: 1

      Probably not; not as a single layout.

      --
      One that hath name thou can not otter
    99. Re:The thing with ASCII by paedobear · · Score: 1

      I think that his sales figures are far too high - and perhaps are driven by the ludicrous cost. All you need to do is look at the Kodansha thing last week to see that a lot of stuff isn't on sale digitally, and won't be, because they're offering stupidly low rates. I don't know about novels, but original keitai comics are entirely done by amateurs - if there's one thing that Japan is not lacking in, it's wannabe mangaka - because the small ventures that are publishing them are basically expecting free work (rates of 1 to 10 yen per page) - if you're counting that, then the web-comic industry is part of the western publishing industry too (I expect it's not currently counted as being such.) The stuff that's available on phones is from a tiny pool of stories - Softbank have something like 200 different comic publishers on their comics menu, but they are all selling from the same pool of 100 or so titles from only one or two publishers, and they all want 1.5x to 2x the price that buying a tankobon would cost, even for comics that are pushing 30 years old.

    100. Re:The thing with ASCII by paedobear · · Score: 1

      There are fewer French layouts that I know of than English layouts (even than just of US layouts) - the French have been historically much more successful at linguistic domination/control of the language than the English have

    101. Re:The thing with ASCII by sznupi · · Score: 1

      Might be not enough to draw conclusions? Were those speeds of native speakers? Some cultural differences causing... different way of reporting own speed? Native EN writing CN, or vice versa? Plus, typical speed is not a direct indicator of how hard it is.

      --
      One that hath name thou can not otter
    102. Re:The thing with ASCII by wvmarle · · Score: 1

      Those are all native Chinese speakers, who have learned English as a second language.

      I don't know of any native English speakers who can type Chinese (because learning to speak it is hard so westerners in Hong Kong do not even try, learning to read/write is even harder so basically no non-Chinese ever becomes anywhere near native-quality in that).

      After living eight years in Hong Kong I can have a simple conversation in Chinese, can read most of the menu in restaurants, and bits and pieces elsewhere. With that I'm way ahead of most Westerners.

    103. Re:The thing with ASCII by Bodrius · · Score: 2, Funny

      Wasn't there already a seminal paper on this topic?
      http://public.research.att.com/~bs/whitespace98.pdf

      --
      Freedom is the freedom to say 2+2=4, everything else follows...
    104. Re:The thing with ASCII by angus77 · · Score: 1

      The only Japanese comic I read these days is Jin. No ruby there.

      And the fact that you don't read the papers doesn't invalidate that millions of people do. And I've never heard of an illiteracy problem localized to Yokohama. And it's certainly not a Shizuoka thing, as I spent 6 years commuting to monthly meetings in Nagoya and never ran across anything resembling an illiteracy problem.

      You'll also notice that I've said *I* can *write* more than 500 kanji---and I still have to ask others to help me when I can't remember how to write one, and it's rare that the native Japanese speaker hasn't been able to help me immediately (there may be the occasional false start, but English speakers have the same problem with their spelling).

      Oh, and:
      kani
      kani
      kani
      kani
      kani
      Would you like some more examples? It took me all of 2 minutes to find these and mark them up.

    105. Re:The thing with ASCII by RockDoctor · · Score: 1

      The thing with ASCII is that it's easy to write on standard keyboards, and does not require a specialized layout.

      What language do you write in? Oh, probably you're one of those English speakers. It must be sad being so restricted.

      --
      Birds are not dinosaur descendants;birds are dinosaurs, for all useful meanings of "birds", "are" and "dinosaurs"
    106. Re:The thing with ASCII by angus77 · · Score: 1

      Well, I don't read a lot of comic books (actually, that's a lie---I read a lot of Western comics), so I won't claim to know the details of the web-comics industry here (nor do I care, honestly).

      Paper books, however, are plentiful---tons of bookstores, new and used, well-stocked libraries. And most importantly, I actually see people reading every day. Y'know, books, the ones without pictures.

      The illiteracy in Japan meme is a myth. I have yet to meet someone who honestly can't handle a newspaper.

    107. Re:The thing with ASCII by 91degrees · · Score: 1

      Why should the notations which we use to express our programs be limited to 'standard keyboards'?

      Because they're a standard. And they're easy to get hold of. I can write a C++ program on a netbook over telnet and never have to use the alt key. If I get sick of the pokey little keyboard I can go to any computer shop and buy a better keyboard.

    108. Re:The thing with ASCII by lxs · · Score: 1

      But apart from the alphabet, what have the Romans ever done for us?

    109. Re:The thing with ASCII by paedobear · · Score: 1

      I'm not saying that the Japanese are illiterate - I'm just saying that the "99% literacy" rate is a myth - it should be obvious from how much more complex the written Japanese language is than English / French / etc. There are plenty of sources that say that "Japan has an amazing 99% literacy rate" without mentioning that the 99% figure is an assumed rate from a UN study which grants the same 99% rate to all the other developed first-world countries. There's also the issue of "literacy" vs "practical literacy" - the UK has been concentrating on the latter for a while even though it has a seemingly high literacy rate

    110. Re:The thing with ASCII by paedobear · · Score: 1

      A quick google gives 227 million results for (hiragana) 6 million for (katakana) and 34 million for (kanji) though there's an element of linking there (where the kanji search gives pages that only have kana) so the results are totally unscientific. I also didn't say that people in Yokohama are illiterate - just that if you honestly think that most Japanese people can actually write all the jouyou kanji from memory you are sadly mistaken - why do you think that "wapuro-baka" was coined. I'm sure that at some point they knew how to write them all, but they've forgotten over time. I for one have forgotten plenty of things that I studied far more recently than when I was 16, and I don't think of myself as an idiot - there's only so many things that one can remember.

    111. Re:The thing with ASCII by Chrisq · · Score: 3, Informative

      You know, this was tried. It was called APL. It sucked, and I mean, like the environment outside the ISS.

      I thought it sucked. You thought it sucked. A load of guys from the maths department that wanted to do quick mathematical computations loved it. APL was not meaningless symbols to everyone.

    112. Re:The thing with ASCII by angus77 · · Score: 1

      Then the onus is on you to show us where the figures are wrong.

      And if you somehow prove that the literacy rate is "really" only 97%---what have you actually proved?

      I see no Japanese people struggling with reading here. I only started to read Japanese when I was 18, and I manage to make my way through the papers, usually without a dictionary these days. I never took Japanese in college, either---it was all on my own, with the occasional private tutor. Are you seriously going to tell me that my half-assed reading skills are superior to those of native speakers who had a good decade-and-a-half head start on me (and a high level of command of the language by the time they even started to learn to read)?

    113. Re:The thing with ASCII by Chrisq · · Score: 3, Interesting

      Plus the fact that a spoken language changes - good chance you would not be able to understand English as it was spoken say 500 years ago. They would not only have used different words, also used a different pronunciation.

      That depends on what accents you are used to. Many Northern British and lowland Scots accents were not changed by the "Great Vowel Shift" nearly as much as Southern English, Received Pronunciation, or General American.

      Being your slave, what should I do but tend
      Upon the hours and times of your desire?

      Will have an immediately obvious meaning when read in a lowland Scots accent

    114. Re:The thing with ASCII by gl4ss · · Score: 1

      you wouldn't have any more fingers either.

      if someone wants to 'code' with graphical uml editors and such, they can do so today. it's just not a very good way.

      --
      world was created 5 seconds before this post as it is.
    115. Re:The thing with ASCII by angus77 · · Score: 1
      I never claimed everyone could write all 2000-or-so jouyou kanji--that's you putting words in my mouth. You claimed:

      And how many jouyou kanji does the average person actually know - maybe 500? They can probably only read just north of 1000, too

      which is laughably false. You couldn't function in this country if you could barely recognize 1000 kanji (unless you're an English teacher or Southeast Asian hostess).

      As for "kani", I also never claimed that the kanji was universally used, only that it was in common use. I'd be shocked if there were many people who didn't recognize it on sight. There are an awful lot of Japanese people in Yokohama, I've heard. Print out the kanji for "kani" and pretend like you don't know what it says. Ask around and see how many adults honestly can't tell you.

    116. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      Except for the Germans. I don't think their language uses spaces.

      Fromsomeonethat'slearninggerman:welldone.

    117. Re:The thing with ASCII by TheRaven64 · · Score: 0, Troll

      There are several potential problems, but also some very old solutions. The problem with your 'get a better keyboard' idea is that source code is another form of communication. I work on a couple of projects that have contributors in France, Britain, Taiwan, the USA, China, and Japan. ASCII is the lowest common denominator - all of these people use different keyboard layouts, optimised for their native language, but all of them can enter ASCII.

      The solution was available in pretty much every Smalltalk implementation from the early '80s. Smalltalk uses the caret character for return statements, but pretty much any editor will display it as an up arrow (which doesn't appear on any keyboard that I've seen). There's no reason why the editor has to display exactly the same characters that appear in the source code, or has to insert exactly the typed characters into the source file.

      --
      I am TheRaven on Soylent News
    118. Re:The thing with ASCII by TheRaven64 · · Score: 4, Interesting

      Apple's documentation in HTML form has a few of the standard ASCII characters replaced with other unicode characters. If you copy and paste into a text editor, you get compiler warnings which seem to be saying that they're expecting the character that is there. They also sometimes contain ligatures, which you don't notice unless you look one character at a time. One of the most irritating problems I found was on the Nouveau wiki a load of constants have 0x prefixes where the x is actually a unicode multiplication symbol. Copy them into the code and it looks right, but the compiler rejects it as an invalid constant type.
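
      One way to catch this before the compiler does is to scan pasted text for anything outside ASCII. This is only a sketch, assuming Python's standard unicodedata module; the function name and the REG_BASE constant in the sample line are invented for illustration.

```python
import unicodedata


def non_ascii_report(text):
    """List (line, column, code point, name) for every non-ASCII character."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                name = unicodedata.name(ch, "UNKNOWN")
                findings.append((line_no, col, f"U+{ord(ch):04X}", name))
    return findings


# Hypothetical pasted line: the "x" in the hex prefix is really U+00D7.
pasted = "#define REG_BASE 0\u00d71000"

for finding in non_ascii_report(pasted):
    print(finding)
```

      It flags every non-ASCII character, including legitimate ones in comments or string literals, so the report is something to read through rather than to act on blindly.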

      --
      I am TheRaven on Soylent News
    119. Re:The thing with ASCII by NickFortune · · Score: 2, Insightful

      I thought it sucked. You thought it sucked. A load of guys from the maths department that wanted to do quick mathematical computations loved it. APL was not meaningless symbols to everyone.

      Right. It's a niche language, very useful for a fairly narrow subset of programmers, but something of an impediment for the rest of us.

      The point is that using an expanded set of glyphs didn't, of itself, make a language that was widely useful, let alone better. At the same time, it brought considerable drawbacks, many of which have already been mentioned in this thread.

      Of course, that doesn't mean you couldn't leverage unicode to create a more expressive syntax. But TFA doesn't really have any ideas on how this is to be done apart from "obviously, more glyphs would be better", which I think APL disproves, at least in the general case.

      --
      Don't let THEM immanentize the Eschaton!
    120. Re:The thing with ASCII by Anonymous Coward · · Score: 1

      Unsere Sprache verwendet schon Leerzeichen; sie ermöglicht es halt aber auch, Wortkonstruktionen zu schaffen.

    121. Re:The thing with ASCII by Anonymous Coward · · Score: 1, Funny

      Except for the Germans. I don't think their language uses spaces.

      Of Kurs wir usen der Spacetasten in der Deutschenlanguage, you Insensitivklod!

    122. Re:The thing with ASCII by Anonymous Coward · · Score: 2, Insightful

      Color-as-syntax has already been done [colorforth.com] in Chuck Moore's latest implementation of Forth. It's not a bad idea,

      As a color-blind person, I'd like to say... yes. Yes,it is.

    123. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      I assume you're referring to Cantonese since you're in Hong Kong. The Beijing dialect of Mandarin is not difficult to learn; plenty of Westerners speak it. Cantonese, while it has more tones, is not impossible to learn either with some classes and elbow grease. Stop perpetuating the myth that Chinese is some mystical hard-to-learn language when it obviously is not.

      http://www.youtube.com/watch?v=XgHvPyOwrG4

      http://www.youtube.com/watch?v=_Ws28F2DlrA

      http://www.youtube.com/watch?v=r9-PPFA48AY&feature=related

    124. Re:The thing with ASCII by Haeleth · · Score: 1

      If by "had worked for us for centuries" you mean "had been used for a handful of short inscriptions", yeah. It's hard to call a writing system a success when the society that used it was not literate.

      Realistically, even if we suppose that runes were inherently better suited to the writing of Old English, and even if they had somehow managed to survive even through the Norman Conquest, our orthography would still have been largely fixed by the introduction of printing, so they would still suck for writing modern English.

    125. Re:The thing with ASCII by arth1 · · Score: 1

      It's not only Apple, but newer versions of Outlook too. Paste a line like PATH="$PATH:/opt/bin" and it looks OK on-screen. But if the recipient copies and pastes it into a unicode aware editor, he'll find out that it will break whatever script this was for because the quotes were Unicode quotes.
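      If you're on the receiving end, one workaround is to map the typographic quotes back to ASCII before the line goes anywhere near a shell. A rough Python sketch, assuming only the four common "smart" quote characters are involved (the names `QUOTE_FIXES` and `straighten_quotes` are mine, not from any tool):

```python
# Typographic quotes that mail clients commonly substitute, mapped back to ASCII.
QUOTE_FIXES = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
})

def straighten_quotes(text: str) -> str:
    """Replace curly quotes with their plain ASCII equivalents."""
    return text.translate(QUOTE_FIXES)

# The line from above, as it might arrive after the mail client "helped".
pasted = "PATH=\u201c$PATH:/opt/bin\u201d"
print(straighten_quotes(pasted))
```

      It won't catch every substitution a mail client can make (dashes and ellipses get mangled too), but it covers the quote case.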

    126. Re:The thing with ASCII by jadrian · · Score: 1

      So what?

      It is also tedious to write long function names, at least it would look concise when written. Also if you have a few non-ascii symbols you use all the time, there are ways to simplify their input. Quite a few languages do this already (check some theorem provers like isabelle and coq and their proof general interfaces for emacs).

    127. Re:The thing with ASCII by Haeleth · · Score: 1

      You're a bit confused---Classical Chinese had the 'one word one character' thing

      And was an entirely written language, never natively spoken by anyone. In spoken Chinese, most words require at least two characters to write.

      Unfortunately most Chinese people believe that their language has this unique one-word-one-character property, and understandably get rather upset when some foreign scholar comes along and tells them he knows better -- so this myth is going to take a long time to die ...

    128. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      Concrete...parts of our legal system (division of Civil versus Penal, etc...)...plumbing...orgies...

    129. Re:The thing with ASCII by Anonymous Coward · · Score: 1, Informative

      So, does this mean you've got Monochromacy? If this is the case, how good is your vision overall? Most people with Monochromacy typically can't see well enough to use a computer without a text-to-speech converter. If you don't have Monochromacy, then perhaps you could shift the colors so that you can see it. In its stock configuration, if you've got red-green colorblindness, you'd have a difficult time using the language without modifying the color rules. As it stands, I'd still have to agree, it causes its own set of problems, but there's an "expanded" version of "Color Forth" that's not as terse and doesn't rely on color for hints, which you could probably still use if there's a problem.

    130. Re:The thing with ASCII by luis_a_espinal · · Score: 1

      Japanese is typed using a more-or-less standard QWERTY keyboard.

      ...then requiring the input to pass through what amounts to a tokenizer to get the phonetic spelling, and into another program, which needs a database of words and has to prompt you for each one in order to select the proper one from a list.

      Not something as simple as writing ASCII by a long shot.

      It isn't as complex or cumbersome as many of you think either. I know, I've seen them (Japanese) going at it. The challenges of Japanese writing are not unique to typing, as they also apply to handwriting. There is really no option other than to do what they do with their typing software. And it is very effective and not that cumbersome for someone who already knows Kanji, Katakana, Hiragana and Romaji (the usage of the Roman alphabet for Japanese words) - namely anyone literate in Japan.

    131. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      For German and possibly other western locales there is the NEO2 layout, which addresses at least some of the issues of typing unicode glyphs/characters/whatever on a standard keyboard. Link (in German): http://www.neo-layout.org

    132. Re:The thing with ASCII by imakemusic · · Score: 2, Informative

      They have a point though. Presumably, if typing something up you would have to look back and forth between the source text and the screen as opposed to English where you can stare at the source text and be sure that when you press the "a" key you get an "a".

      --
      Brain surgery - it's not rocket science!
    133. Re:The thing with ASCII by angus77 · · Score: 1

      If by "had worked for us for centuries" you mean "had been used for a handful of short inscriptions", yeah.

      You mean a handful of short inscriptions that managed to survive. Also, the Rune Poem is hardly short enough to be called an "inscription", and "Solomon and Saturn" is 550 lines long.

      It's hard to call a writing system a success when the society that used it was not literate.

      ?!?!? The average Greek and Roman was illiterate, too. By that logic I suppose their writing systems were a flop as well!

      Anyways, you do realize that the point was that the Roman alphabet was not designed for (or very well suited to) English, but we've made perfectly good use of it anyways, right? Not that anyone was trying to promote runes as a system to supplant it, right? And further, that if English could hack its way around the Roman alphabet, it's not unreasonable that the Japanese could hack their way around a QWERTY keyboard (which they have managed to do).

    134. Re:The thing with ASCII by angus77 · · Score: 1

      I'm no sinologist, but wikipedia seems to suggest that Classical Chinese did reflect spoken Chinese at least to some degree, and that what you're referring to would more properly be called Literary Chinese.

      I don't remember where I read this, but I remember reading in a book years ago that Classical Chinese had a lot more phonemes and tones compared to modern Mandarin, which made the 'one word one syllable' thing practical in the spoken language. It's the loss of this variety that has made modern spoken Chinese polysyllabic.

    135. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      Or, you might not - you can space-to-kanji entire sentences at once, whilst the romance languages are stuck hitting space between every word like shmucks. Except for the Germans. I don't think their language uses spaces.

      German is not a Romance language(*), it's Germanic... and it uses spaces. What you are thinking of is composite words, a very useful and expressive concept. Basically, you invent new words from a few base words as you speak/write, and everybody magically knows what they mean, even if they have never been used before. Composite words exist in most Germanic languages, although the rules for making them vary greatly. You can even create new composite words in English, but the rules for Modern English word composition are very limiting, almost useless. Most of the English vocabulary stems from old composite words from the time when English, and its predecessors, still had useful word composition.

      (*) But it uses a Roman alphabet.

    136. Re:The thing with ASCII by fyngyrz · · Score: 5, Insightful

      As a martial artist of many decades, I have learned to read Chinese. Both traditional characters and the nasty simplified ones. So I'm well aware of the upside - the power, and even beauty, of high-speed recognition from a large symbol set.

      But writing Chinese through a keyboard or a GUI has many cautionary lessons for us here that transfer directly to the idea of a many-symbol programming language. Take Python, for instance. A beautiful language in almost every way; visually well structured, minimalist in its core tools, yet so well thought out that it is almost unlimited in what can be done with it.

      If you were, say, to create a symbol for each Python grammar atom, you'd soon have a symbol set equal to or surpassing that required for college in China... thousands of them. This takes your average Chinese person many years to learn, by the way -- and it's non-technical.

      Now, assuming you've learned these in the first place, and stipulating that somehow, you've made them as beautiful and intuitive as the language itself, how do you select these symbols when programming? Therein lies the rub, and as no one yet has come up with a good answer for Chinese, I suspect the idea desert is just as dry for Python, or any other language one might like to turn into a concise symbolic tool.

      Now, speech has very fast mapping (although you get into context a lot... for instance "ma" can mean quite a few different things) to Chinese symbols, and so one could reasonably assume that it could also have reasonably fast mapping to my hypothetical Python symbols, but speech recognition isn't ready for this yet; and a programmer speaking "Pythonese" into a microphone isn't going to be a very good cube-mate, either.

      In the meantime, I'm quite convinced that ASCII is an excellent character set for programming, and that UNICODE belongs inside quotes for use in input and output parsing, no more, no less.

      APL suffered from all of this. You needed a special keyboard, or a GUI or other mechanism to input the "simple" symbol. You had to learn the symbolic mapping. It really represents a huge extra load in aim of simplification. All of which is completely unnecessary if you simply use ASCII. And frankly... the time it takes me to type sin(x) is going to beat your mapped keyboard input time until you've been doing it for 50 years. In which time I will have leveraged my ASCII toolkit into innumerable languages, and your APL toolkit is still only enabling you to work in APL.

      So like I said... ASCII.

      --
      I've fallen off your lawn, and I can't get up.
    137. Re:The thing with ASCII by Kagetsuki · · Score: 1

      If Japanese were spelled out phonetically in roman letters (English letters) it would be very hard to read. If you mean for input then no, it is not hard to input. Here is a video I found on a quick Google search: http://www.youtube.com/watch?v=Dk8Ojb3bCm8 . It's as simple as it looks.

    138. Re:The thing with ASCII by Kagetsuki · · Score: 1

      There are a lot of benefits to Kanji, and I would much rather have Japanese with Kanji than without. Granted the language would be "easier" without Kanji, but if you ask me it would also be "stupider". Kanji allows you to express more than just a word - each symbol having a meaning allows you to pack a lot of information in a smaller amount of characters. And knowing the meaning of characters makes their combinations significantly more clear. I guess I feel that Kanji really gives the language an additional depth that you can't find in purely phonetic languages.

    139. Re:The thing with ASCII by LWATCDR · · Score: 1

      Thanks, I was going to add: see APL for why this is a bad plan.
      Do we really want to make people get programmers keyboards and learn a new way to type?

      --
      See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
    140. Re:The thing with ASCII by angus77 · · Score: 1

      In the case of copying text like that, I suppose. Not something I've found myself doing, though.

      Any more hoops?

    141. Re:The thing with ASCII by Kagetsuki · · Score: 1

      I actually had no intent of agreeing with points from the article. I do feel languages should offer UTF8 support for strings at the base level, as that would make my coding life significantly easier (I spent several days last week trying to figure out how to get MySQL to stop mangling UTF8 strings I was handing it from a CGI script). Allowing UTF8 variables and function/object names would be neat but not really necessary. Using special characters not found in ASCII as a core part of the language would be terrible and is a profoundly stupid idea - it's just not practical or necessary.

    142. Re:The thing with ASCII by Kagetsuki · · Score: 2

      Just to clarify again, I wasn't trying to support the ideas in the article, just pointing out how Japanese was entered. But for the complex mathematical symbols I very much agree; the fact that I can enter the name of a symbol and actually get that symbol as a character is great. That does not, however, mean we should replace "!=" with "≠"; I'd quickly get sick of having to constantly activate the IME just to code.

    143. Re:The thing with ASCII by Jesus_666 · · Score: 1

      thing is, I have a { key and a } key, but not a [RED] key, nor a [BLUE] key. Which means either memorizing IDE-specific keyboard shortcuts (and then relearning when i have to use a new tool), or lots of clicky-clicky with the mouse, which takes longer and does more damage over time.

      Which only makes sense since colorForth looks like it was explicitly designed to draw aggro from developers.

      --
      USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
    144. Re:The thing with ASCII by alta · · Score: 1

      Regardless of what it actually was, the poster made it sound like my old cell phone using its weird sort of assistive typing. It was beneficial there because I only had 8 keys to choose from, so I could hit the 1, and by the end of the word it would usually figure out if I wanted an A, B or C. The longer the word, the more accurate. But the problem is, it wasn't very natural and you always had to check that it actually put in the word you wanted. Maybe it's not that difficult and maybe your description was more complicated than it really is.

      1. Hit a key to activate IME. Is this a modifier, win+I? F12? SysRq+alt+ctrl+Pause?
      2. Type a word
      3. Computer converts to kanji? Is this subjective or objective? Is it ALWAYS correct, or is there some AI here interpreting what you mean. If not a 1to1 kana to kanji, then is it 90% accurate? 80%?
      4. Hit enter and you're done. Ok, so I hit Enter, I'm done. Oops, no I'm not, I want to keep going. Do I start at 1 or 2?

      This may be a fine way to type a letter to someone, write a term paper or something else. But this doesn't sound like a lot of fun for programming. Text flows from word to word. Code (or at least code in the languages I've written in) doesn't flow in long sentences. It's a lot of short lines. Can't imagine having to go in and out of context between line 12 and line 13.

      I don't agree with the GP's apparent dislike for Kana/Kanji but I do agree with the first part. It seems laborious and I wouldn't want to try to program in it.

      My question is this. Is Kanji something that's been around for a long time or something made to accommodate computers?

      --
      Do not meddle in the affairs of sysadmins, for they are subtle, and quick to anger.
    145. Re:The thing with ASCII by imakemusic · · Score: 1

      Well, I imagine it would be hard if you don't speak Japanese...

      Nah, just kidding. I've got no point to prove here, I was just pointing out that robbak's comment wasn't entirely without merit.

      --
      Brain surgery - it's not rocket science!
    146. Re:The thing with ASCII by AmiMoJo · · Score: 2, Informative

      My experience is with Japanese but they share the Chinese writing system (as well as their own).

      While there are a large number of symbols most of them are made up of two or more other, simpler symbols. If you find a symbol you don't know you can often guess the general meaning just from the simpler ones it is made up from.

      That is not totally unlike how words in English work. Often they are made up of smaller parts or derived from other words.

      To bring this back to programming I'm not sure there is much to be gained by extending the available symbols. I don't feel any great desire to type the greater-than-or-equal-to symbol instead of >=.

      --
      const int one = 65536; (Silvermoon, Texture.cs)
      SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
    147. Re:The thing with ASCII by hey! · · Score: 1

      Well, keyboards are neither here nor there. It simply makes sense to build programming languages out of Latin symbols, digits, arithmetic operators, and a small, common set of punctuation (".") and glyphs ("/"). Really, TFA is naive about the problems of K&R C, which have little to do with ASCII per se. The problem with C wasn't that it didn't have enough distinct looking operators; the problem was the nature of the data types that those operators acted upon.

      C was designed to be a replacement for assembly language that allows programmers to express themselves in terms of multi-operator expressions, structured flow of control, and functions. If you look at C's data types, they represent the state of hardware artifacts in a kind of idealized, standardized machine: bytes (chars), words (ints), long words (long), single (float) and double (double) floating point numbers and finally, memory addresses. It was up to the programmer to provide semantics in K&R C. "if (a | b | c)" is almost the exact equivalent of how you would perform a test in assembler: store (a | b) in d; store (d | c) in d; jump on non-zero d ... The programmer is responsible for expressing what he means in terms of mechanical artifacts.

      We used to say that C was "weakly typed"; but that implicitly frames the question in a way that is foreign to C's initial goals, which were to make an expressive, low-level (close to the iron) language you could write most of an operating system in. Thus C was less than perfect from an application programmer's standpoint. Take the C "char"; it really represents a byte (which initially was understood to be the smallest addressable collection of bits, not necessarily 8 bits). This also happened to be very convenient in the 7-bit ASCII days for representing characters, but even in C's heyday a byte wasn't big enough to represent the value space of many languages' character sets. An application programmer has little or no need for a data type with the semantics of C's "char"; he only uses "char" because the C language doesn't provide him with what he really needs: a data type that represents a single glyph within a string. With such a type, if he called "fgetc" on an alphanumeric file, sometimes he'd get eight bits, other times sixteen, but he would neither know nor care. If he wrote "fooString[1]" sometimes he'd get something offset eight bits from the start of "fooString", other times sixteen or even thirty-two, because he doesn't really care about memory layout. An operating system writer *does* care about the layout of memory, because semantically he's moving bytes around, not characters. He knows darn well the size of an 'int' on his target platform, and how struct members are aligned; it's part of his model of his problem domain.
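      To make the byte-versus-character distinction concrete, here is a small Python sketch (Python rather than C, purely for illustration; the sample string is arbitrary). Python's `str` indexes by code point, which is close to, though not always the same as, the "single glyph" type described above, while `bytes` behaves like an array of C chars.

```python
# Three characters: Latin "a", "e" with acute accent, and a CJK ideograph.
text = "a\u00e9\u65e5"
encoded = text.encode("utf-8")

print(len(text))      # number of code points in the string
print(len(encoded))   # number of bytes in the UTF-8 encoding
print(text[1])        # indexing the str yields a whole character
print(encoded[1])     # indexing the bytes yields one byte of a multi-byte sequence
```

      The application programmer wants the first view; the operating system writer wants the second.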

      C only took off as an application programming language because of the limitations of early microcomputers (typically 32K of RAM, about 1 MHz clock, eight and sixteen bit registers). C could target such systems very effectively.

      --
      Post may contain irony: discontinue use if experiencing mood swings, nausea or elevated blood pressure.
    148. Re:The thing with ASCII by Luyseyal · · Score: 2, Informative

      I've also thought it would be good to be able to make use of mathematical symbols for, you know, mathematics. The same could be said of word processor-like formatting for comments. I'm dubious about using it for actual code, but I'm open to having my mind changed about that.

      Yeah, I like the idea of TeX-style typing that autoparses to a "nice" display. You can edit the display or drop to TeX (or Maple or whatever) input if you need more specificity.

      I'm not sure the benefit conveyed is sufficient to overcome the awkwardness (if you've ever used a Maple worksheet for programming, you'll understand what I mean), but I would like to see an editor take advantage of the beauty, even if the code itself is ASCII.

      -l

      --
      Help cure AIDS, cancer, and more. Donate your unused computer time to worldcommunitygrid.org. Join Team Slashdot!
    149. Re:The thing with ASCII by enec · · Score: 1

      Actually I use a Finnish/Swedish keyboard and speak Finnish. I am well aware of different keyboard layouts with accents or special characters. However, even those share most of the keys with a US English keyboard layout.

      A keyboard with a-z, the usual punctuation marks, and possibly a few accented or umlaut characters is and will be the de facto keyboard unless we start replacing normal text with unicode characters. Until that day (which I hope never comes!), you'd have to use a separate keyboard for programming in this unicode programming language. And as already said in this thread, they tried it with APL and it failed horribly.

      --
      I'm sorry, I only accept criticism in the form of sed expressions.
    150. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      Phonetically, usually. There's no good universal phonemes for most Unicode.

    151. Re:The thing with ASCII by Lewah · · Score: 1

      Nine times out of ten even Nihon-jin (damn you /. for not letting me type in Unicode!) type in romaji. I don't get the "tediously"-bit.

      When I think "wakarinai" in response to your comment, I simply type "wakarinai" and (assuming I'm not actually trying to post it to /.) voilà!; the kana simply appear. What's tedious about that?

      What people are failing to realize from a development perspective is that while it may take you more keystrokes to type fewer symbols (in the case of kana/kanji), ultimately, reading it back and quickly identifying variables, routines, etc... at a glance is the end goal - not typing less.

      --
      Good karma is like social intolerance; apparently everyone has it but me.
    152. Re:The thing with ASCII by Firethorn · · Score: 1

      I've never heard of anybody capable of using a keyboard using voice recognition because they found the act of entering in words laborious.

      It was on the news quite some time ago, and I think it was in reference to how things worked BEFORE cellphones became common.

      --
      I don't read AC A human right
    153. Re:The thing with ASCII by vegiVamp · · Score: 1

      So what our man is whining about is that he wants a code editor with context-sensitive text completion ?

      And here I was, thinking I used that a decade ago.

      --
      What a depressingly stupid machine.
    154. Re:The thing with ASCII by Altus · · Score: 1

      Other than the fact that the word "Watashi" is longer than the word "I", it really isn't any more difficult than it is in English. You type the word and you hit space; everything else is taken care of for you.

      --

      "In America, first you get the sugar, then you get the power, then you get the women..." -H. Simpson

    155. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      The idea behind modern programming is reducing complexity. That can't really be done by using symbols no other programmer has ever seen before.

      Is this a joke? You use "methods" you've never seen before all the time. A programmer has a serious problem if he has to read every line of code before he can start getting his job done.

      There is only so far you can reduce complexity, and one of the most important ways of reducing complexity is to call things what they are. Languages that quantify over functions need syntax for it. Languages that construct functors need syntax for it. Languages that construct monads need syntax for bind and return. Oh dear me, but you've never seen the symbols before! :0(

      Well you better cope. Mathematics is coming to get you.

    156. Re:The thing with ASCII by RDW · · Score: 1

      'Do we really want to make people get programmers keyboards and learn a new way to type?'

      Well, I was going to write a post explaining how I think this would be a fantastic idea, but unfortunately I couldn't find the percontation point on my keyboard:

      http://en.wikipedia.org/wiki/Irony_punctuation

    157. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      NatürlichhabenwirLeerzeichen!

      For all you Yanks, this literally translates as: Ofcoursewehaveemptysymbols.

    158. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      ASCII is good. Never mind the language input; even ancient versions of Ada support LATIN-1 input. But for crying out loud: do you *really* wanna read comments in whatever language? ASCII so far has effectively killed off every serious intention to use any other lang than English in comments.

    159. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      Be that as it may, most Japanese prefer qwerty over jis. The jis layout suffers from all kinds of horrors, including a lot of common characters located on the digits row and other faraway corners and bad character location optimisation. In other words, when using the jis layout many typists are actually slower than they could be.
      Back on topic, while I do appreciate being able to use accents in identifiers now, I think that the experiment the article author envisions has been done already. It has a tla and a cult following, but almost no major projects use it. I wonder why.

    160. Re:The thing with ASCII by modmans2ndcoming · · Score: 1

      It was never implemented.

    161. Re:The thing with ASCII by gullevek · · Score: 1

      Some people use the direct hiragana input mode, which is why all Japanese keyboards have the hiragana on the keys; if you know where the keys are, it can speed up your input a lot. Instead of typing 7 letters for the word Nihongo, you type just four kana (Ni Ho N Go). Mobile phone input is done that way.

      Still, this is not efficient for programming languages.

      --
      "Freiheit ist immer auch die Freiheit des Andersdenkenden" - Rosa Luxemburg, 1871 - 1919
    162. Re:The thing with ASCII by fdrebin · · Score: 1

      What could "mostly invisible" mean?... (especially considering the sky)

      It means I can't see it, as in the apparent intensity as I see it is roughly about 1/4 to 1/3 what others see. Hard to measure things like that.
      It also means that I wear a narrow palette of colors to stay out of trouble.

      --
      Stupidity... has a habit of getting its way.
    163. Re:The thing with ASCII by shugah · · Score: 1

      Yeah - like there aren't already enough sources of errors in source code.

      --
      If you aren't part of the solution, then there is good money to be made prolonging the problem
    164. Re:The thing with ASCII by shugah · · Score: 1

      At some point, Asian languages are going to have to be dragged kicking and screaming into, oh, maybe the 8th century BCE. Non-phonetic (logographic or ideographic) writing systems are an obstacle to literacy and communication. To be fully literate in English, with a 10,000 word vocabulary, you need to know 26 characters. To be fully literate in Chinese, with a 10,000 word vocabulary would likely require knowledge of 3000 to 5000 characters.

      --
      If you aren't part of the solution, then there is good money to be made prolonging the problem
    165. Re:The thing with ASCII by Anonymous Coward · · Score: 0

      I can get all the unicode characters I need from my keyboard with the neo-layout (http://www.neo-layout.org). In particular many math-related glyphs. And it's also a lot more ergonomic than QWERTY.

    166. Re:The thing with ASCII by gullevek · · Score: 1

      If you can only read that much, you won't be able to do much here. I admit that the younger they get, the less or different Kanji they use in their mail conversation, but what gets written in Newspapers, Literature, etc uses the full set of them. Trust me, they can read them.

      But they might not be able to write them by hand, because they mostly use mobile phones or PCs to write. And it also depends on their work and what they do and how educated they are.

      --
      "Freiheit ist immer auch die Freiheit des Andersdenkenden" - Rosa Luxemburg, 1871 - 1919
    167. Re:The thing with ASCII by wvmarle · · Score: 1

      Vietnamese made the switch already, with the addition of numerous accents. It has been tried for Chinese and failed, because there are too many homophones, especially if the tones are left out. Mandarin nowadays is learned using Pinyin romanisation. And being able to use the same script across various related dialects/languages is a great plus for their character script.

      That notwithstanding it's a bitch to learn.

      Other Asian languages: Japanese has a phonetic version already; afaik there is a way to write that language 100% phonetic. Korean is some character script again. Thai is written phonetic, using their own letters, as Russian and Arabic. India not sure, has several languages/scripts. Tibetan is phonetic, with their own script. Indonesian, Malay, Tagalog (Philippines) also use the Western script without any extras like Vietnamese. Other languages I don't know.

    168. Re:The thing with ASCII by sznupi · · Score: 1

      Seems it was, around a decade ago. Even if it was ignored at the time of its original publication, getting hands on recent implementation might nicely add to the "what if?" experience.

      --
      One that hath name thou can not otter
    169. Re:The thing with ASCII by Skal+Tura · · Score: 1

      In other words you are saying that:
        - Chinese traditional characters are fast to read, slow to type
        - Which implies that using ASCII is fast to type, slow to read

      I don't know how to read kanji or other Asian traditional character sets, so I do not know personally how they compare, but from the sound of it, I drew the right conclusions.

      What if one could use ascii to describe the symbols.

      Like typing:
      sin[space](x)[space]
      turned it into a symbol resembling sin(x)

      The space key would be what signals when to do the conversion; [space][space] would translate to an actual space.

      fast to type, fast to read.
      Of course, you'd have to learn two SYMBOL sets (ASCII is just symbols representing characters, after all), so still a downside

      Just food for thought....
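      To show the scheme is at least mechanically simple, here is a rough Python sketch of that space-triggered conversion. Everything in it is hypothetical: the `SYMBOLS` table is an arbitrary sample mapping and `expand` is just a name I picked, not an existing editor feature.

```python
# Hypothetical mapping from ASCII names to single symbols (illustrative only).
SYMBOLS = {"sqrt": "\u221a", "sum": "\u2211", "pi": "\u03c0", "<=": "\u2264", "!=": "\u2260"}

def expand(keystrokes: str) -> str:
    """Convert ASCII tokens to symbols; a space ends a token, a second space is literal."""
    out = []
    token = ""
    for ch in keystrokes:
        if ch != " ":
            token += ch
        elif token:
            # First space after a token: convert it if known, else keep it as typed.
            out.append(SYMBOLS.get(token, token))
            token = ""
        else:
            # Space with no pending token: emit an actual space.
            out.append(" ")
    out.append(token)  # whatever is still being typed at the end
    return "".join(out)

print(expand("x  !=  y"))
print(expand("sqrt x"))
```

      The downside mentioned above still applies: you have to know both the ASCII names and the symbols they turn into.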

    170. Re:The thing with ASCII by Skal+Tura · · Score: 1

      that being said, what about "descriptive language"

      ie.

      create dialog with subject "foo" and text "bar", buttons "x", "y", "z".
      Map button X to Q, Y to W and Z to E

      or more techie and easier to translate programmatically:


      create dialog {
      subject is "foo"
      text is "bar"
      button x maps to method Q
      button y maps to method W
      button z maps to method E
      }

      now you see, we do this kind of thing all of the time, it's called abstraction, and it's tailored towards the coder, or project code guidelines, while still having the "scent" of who made it. That translates, in the real practical world, to:

      $thisDialogButtons = array( new dialogButton( array('x' => array('title'=>'something', 'callback' => 'class->methodQ') .....

      Of course multiline etc. That is a complex example, more abstracted would be something like:


      $dialog = new dialogWindow(
      array(
      'subject' => 'foo',
      'text' => 'bar',
      'buttons' => array(
      'x' => '$someClass->Q',
      'y' => '$someClass->W',
      'z' => '$someClass->E'
      )
      ));

      while not being symbols, it is descriptive and meaning is revealed to reader immediately. Ofc, bad coders aren't able to do things this simple :( They will create something more like this:


      $id = system->createDialog();
      system->elements->changeContent ( system->elements->findTitle( system->dialogs->find($id) ), 'foo');

      and do it for all windows they create; worst of all is when they use array_map, array_filter etc. recursive functions without commenting! If you are lucky, they also use archaic functions which have no place in modern-day programming, such as array_push instead of $array[], array_insert, array pointers, no classes but global functions etc.!

      But a skilled coder, does abstractions as needed and makes the actual business logic easily readable, sometimes even for non-techies to understand what's going on!

      That's the power of descriptive programming. Keep It Simple, Stupid! That has the added power of doing just that: Keeping things simple, when done right.

      Tho, many mediocre coders take this 5 steps too far and start to abstract trivial things like outputting a html form:


      • Username:
      • Password:

      becomes something nightmarish like this:

      $loginForm = new htmlForm(
      'title' => 'Login form',
      'id' => 'loginForm',
      'fields' => array(
      'username' => new htmlFormField('text', 'Username'),
      'password' => new htmlFormField('password', 'Password'),
      'login' => new htmlFormSubmit('Login!')
      ),
      'method' => new htmlFormMethod('post'),
      'action' => array(
      router::findPath( router::action::getSelf, true),
      system::callback( system::actions::find('controller', 'loginAction')
      ),
      'decorators' => array(
      'line' => new htmlFormDecoratorCustom('

      ', '

      '),
      'lineItemsHeader' => new htmlFormDecoratorCustom('

      • ') ....
        )

        );

        You catch the drift. I wrote this from memory, in a fashion similar to how Zend Framework does it. In the ZF case the end result was that creating a simple login form took 6 hours, rendering it took 1.5+ seconds on a modern quad-core server with no other activity, it had to be cached with custom, hard-to-create unique keys per session, and the end result was a TOTAL mess

    171. Re:The thing with ASCII by fyngyrz · · Score: 1

      Which implies that using ASCII is fast to type, slow to read

      No, it doesn't imply that at all. ASCII is fast to type, fast to read. It's not some kind of "complement" of a pictographic character set; alphabetic sets are simply different. Pictographic sets are inherently inferior to alphabetic sets like English, Korean hangul, Arabic, or Japanese hiragana for keyboarding applications because the input method for a pictographic set sufficient for a programming language is going to be comparatively cumbersome (leaving aside the learning curve.)

      --
      I've fallen off your lawn, and I can't get up.
    172. Re:The thing with ASCII by sznupi · · Score: 1

      The question isn't about how many there are, their existence - but how widespread the few (in practice) of them are.

      One stark example: despite my place having for a long time its own version of qwerty with diacritics, computer keyboards are virtually exclusively of the standard US layout (physically, what this is about; function is slightly modified of course - right Alt acts like AltGr; and nullifying it is a matter of one quick & easy keyboard shortcut - too easy in fact, people often do it accidentally and get totally confused / "the keyboard is broken"). The only "PCs" I used which had a "proper" local keyboard were old Mac Classic, LC475 and some middle-size Quadra (in itself exceedingly rare machines here back then); more recent Macs (much more popular now, relatively; but still quite rare) don't come with such very often - however slightly "weird" the Apple keyboards might be anyway, people perceive their US variant as more "standard & expected national keyboard" than the ones with diacritics...
      A quick search for "typist layout" (how it is called; though vast majority of people aren't even aware of this, and indeed of its existence!) on local auction service, among 1200+ keyboard offers, gave 2 results - one of them a 10+ year old Apple one, the other some new HP one (I'm slightly surprised / would say it's a lot, when it comes to new ones). Biggest online product catalog doesn't have the category and few variants of search term didn't find anything.

      Similarly (if not so extreme) in few nearby places. Apart from their qwertz, standard qwerty is also widely used in Czech Republic and Slovakia (even if in their case AltGr, diacritics, etc. are typically printed on the keys - probably partly because they made an unfortunate choice of nonintuitive positions for letters with diacritics, not "on top"/as a modifier of pure latin ones - that is still essentially a standard US layout, and it's not too hard to see/buy a keyboard without local symbols ... one might as well simply add them with a permanent marker). At least Hungary, Romania, Moldavia, Bulgaria and Netherlands are similar. Few linguistic families already, and only my local examples.

      Now, from glancing at Wiki - the two most prominent places of Francophone, France and Canada, have different keyboard layouts (FR qwerty similar to US vs. FR azerty). France isn't very rigorous itself - Canadian multilingual qwerty is apparently very easy to find, as well as... Portuguese or US international (which isn't at odds with standard one at all). Even neighboring Belgium made changes. Now, I don't think a lot of places in the Francophone would be more rigorous than France about using the "proper" keyboard...

      In contrast "U.S. keyboards are used not only in the United States, but also in other English-speaking places (e.g., Australia, English Canada, Hong Kong, New Zealand, South Africa, Malaysia, the Philippines and India)" - most of the Anglophone, it seems (excluding only UK itself and Ireland). So, regarding your last post - it would seem that in former colonies FR-US cancel each other out at best (for azerty) - and then there are quite a few non-EN places using US qwerty.

      Checking the biggest potential market, and one which might dictate things in the future ... contrary to what AC said, apparently the most popular keyboard in China is a standard US qwerty, with some additions on top of existing keys (without modification of the underlying layout, just like in my area), used phonetically with their input method software (BTW, at least in the case of languages from my area it's not much of a stretch to say that the standard US layout, with its latin letters, is better for them than for English - their alphabets are much more phonetic than EN one, they are generally very close to latin pronunciation)

      So yes, there definitely exists a standard keyboard layout. And even if not used everywhere, it's apparently easy to obtain worldwide anyway...

      --
      One that hath name thou can not otter
    173. Re:The thing with ASCII by sznupi · · Score: 1

      Seems HK uses actually one of the fastest input methods - but one which is quite difficult to learn. While some other (slower...) methods can be learned rapidly.

      Yup, the relationship between speed and difficulty is tenuous at best... (nvm using that speed in comparisons with EN)

      --
      One that hath name thou can not otter
    174. Re:The thing with ASCII by chelberg · · Score: 1

      APL was awesome! I wrote more write-only code in APL when working for IBM than I have ever since. IBM machines had APL keyboards which worked just fine :). Seriously, APL is great for many mathematical/matrix related purposes.

    175. Re:The thing with ASCII by modmans2ndcoming · · Score: 1

      When you talk about something being the first computer, you cannot count a vanity implementation based on the documentation decades after it was conceived.

      Might as well call Babbage's Analytical engine the first computer while we are at it then.. even though it was never built... someone is building it as we speak so when they complete it, it will immediately give Babbage credit for building the first computer right?

    176. Re:The thing with ASCII by sznupi · · Score: 1

      I assumed you were talking about Plankalkül, since talking about Z3 doesn't make any sense - it was fully built, fully functional in 1941.

      --
      One that hath name thou can not otter
  2. Don't we all know... by AsmCoder8088 · · Score: 2, Insightful

    "Syntactic sugar causes cancer of the semicolon" - Alan Perlis.

  3. Project Gutenberg by symbolset · · Score: 5, Insightful

    Michael decided to use this huge amount of computer time to search the public domain books that were stored in our libraries, and to digitize these books. He also decided to store the electronic texts (eTexts) in the simplest way, using the plain text format called Plain Vanilla ASCII, so they can be read easily by any machine, operating system or software.

    - Marie Lebert

    Since its humble beginnings in 1971 Project Gutenberg has reproduced and distributed thousands of works to millions of people in - ultimately - billions of copies. They support ePub now and simple HTML, as well as robo-read audio files, but the one format that has been stable this whole time has been ASCII. It's also the format that is likely to survive the longest without change. Project Gutenberg texts can now be read on every e-reader, smartphone, tablet and PC.

    If you want to use Rich Text format, or XML, or PostScript or something else then fine - please do. But don't go trying to deprecate ASCII.

    --
    Help stamp out iliturcy.
    1. Re:Project Gutenberg by shutdown+-p+now · · Score: 5, Insightful

      If you want to use Rich Text format, or XML, or PostScript or something else then fine - please do. But don't go trying to deprecate ASCII.

      This is a false dichotomy. Plain text can be non-ASCII, and ASCII doesn't necessarily imply plain text. All the formats you've listed allow adding either visual or semantic markup to text, whereas ASCII is simply a way to encode individual characters from a certain specific set. They do not propose to move to rich text for coding, but to move away from ASCII.

      There are still many reasonable arguments against it, but this isn't one of them.

    2. Re:Project Gutenberg by snowgirl · · Score: 1

      They do not propose to move to rich text for coding, but to move away from ASCII.

      This is a bit of a false dichotomy as well. An ASCII-7 text is identical to the UTF-8 encoding of the same text.
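The ASCII/UTF-8 claim is easy to check. A minimal sketch in Python (the sample string and variable names are arbitrary): encode the same text both ways and compare the bytes, then repeat for all 128 ASCII code points.

```python
# Any pure-ASCII sample will do; this one is arbitrary.
text = "Hello, World!"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

# The two encodings produce the same byte sequence for ASCII-only text.
same_bytes = ascii_bytes == utf8_bytes

# Every byte stays below 0x80, i.e. fits in 7 bits.
all_seven_bit = all(b < 0x80 for b in utf8_bytes)

# The same holds for the whole ASCII repertoire, control characters included.
full_ascii = "".join(chr(i) for i in range(128))
full_same = full_ascii.encode("ascii") == full_ascii.encode("utf-8")
```

So an ASCII-only source file is already a valid UTF-8 file, byte for byte; the reverse only holds as long as nothing outside the 128 code points is used.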

      There are a few issues with Unicode, in that CJK characters are lumped together by semantics, while LGC are not. Thus, while simplified Chinese, traditional Chinese, and Japanese may all write the same "character" differently, they are all represented by the same codepoint, while "o" despite being pronounced identically from the most common Latin-based written languages to Cyrillic are written with different codepoints, even despite having identical appearances.

      Either way, for instance Perl, supports code written in UTF-8, which is awesome, and it's fairly unicode agnostic about everything. So being able to code using variable names written in your own language, vs. transliterating them into Latin characters is a huge benefit... but ultimately only a minor factor in programming.

      The matter still remains that programming languages are heavily dependent upon English for keywords and such, and as a result, are heavily dependent upon some representation thereof.

      But all of this ignores the matter that ASCII is a subset of Unicode anyways... so why be so dorky about "zomg, get rid of ASCII!!!!" it's retarded...

      --
      WARNING! This girl exceeds the MAXIMUM SAFE standards established by the FDA for BRATTINESS
    3. Re:Project Gutenberg by Netbrian · · Score: 5, Informative

      This is untrue.

      First off, Simplified and Traditional characters are separated in Unicode.

      Second off, Cyrillic characters and Latin characters have always been considered two different scripts, while Chinese logographs are considered to be the same script, used in different contexts.

      See http://unicode.org/notes/tn26/.

      In any event, it would make good sense for programming environments to be able to handle Unicode source.

    4. Re:Project Gutenberg by pz · · Score: 5, Insightful

      When I was a young graduate student building my first experimental setup, a professor who was older and wiser than me suggested that data should be saved in ASCII whenever possible because space was relatively inexpensive and time is always scarce. Although I thought that a bit odd, I did follow his advice.

      The result? I can use almost any editor to read my data files from the very start of my career, closing in on 30 years ago. Just this past week, that was an important factor in salvaging some recently-collected data. In contrast, I can't always read the MS Word files -- an example of an extended character set -- from even a few years ago, and I sure as hell can't view them in almost any editor. Sure, with enough time, I can or could, figure out how to read them, but, as the wise professor rightly pointed out, time is scarce.

      Thus, compatibility is important, and the most compatible data and document format is human-readable plain ASCII.

      --

      Put my fist through my alarm clock with its ding-dong death inside my ear. - The Blackjacks.
    5. Re:Project Gutenberg by Anonymous Coward · · Score: 0

      I love ascii ebooks. They work fine on my cellphone or indeed any of my ebook readers.
      I just wish Project Gutenberg had a MUCH better interface. What's with the downloaded book names just being random numbers? Maybe I'm doing it wrong, currently I have to rename them as I download so I can tell what they are later on...

    6. Re:Project Gutenberg by icebraining · · Score: 1

      I'm pretty sure plain UTF-8 will be readable for many, many years. While I disagree with using characters outside ASCII to program, future compatibility doesn't seem to me a valid reason to keep it.

    7. Re:Project Gutenberg by Anonymous Coward · · Score: 0

      MS Word files are not an example of an extended character set, you need to educate yourself.

    8. Re:Project Gutenberg by Anonymous Coward · · Score: 0

      Since space is indeed inexpensive, save as .doc and as .asc (.txt)

      Problem is of course when the "cabinet-drawer-file" metaphor goes overboard you won't be able to find your file anymore.

    9. Re:Project Gutenberg by paedobear · · Score: 1

      Considered by the unicode consortium - who seemed to spend more time discussing whether or not to add a Klingon language plane than discussing Chinese-Korean-Japanese. They're as connected as Roman-Cyrillic-Russian characters are, and the ignorance of the unicode consortium has led to millions of localisation bugs, and a far reduced use of unicode in China, Japan, and Korea than in the west. Oh and there are a bunch of characters that are only one-way mappable JIS/SJIS/EUC to Unicode due to botched mapping tables (at the very least for perl and java's unicode)

    10. Re:Project Gutenberg by symbolset · · Score: 1

      Good point. Maybe one day Unicode will win out, or perhaps EBCDIC will have a resurgence. 'twixt now and then it's best to write the text in ascii, perhaps with a well-documented human-readable escape table for symbols that aren't represented - perhaps even a complete Unicode escape table current to the document. Then in 2050 when somebody wants to use the data again in BufTable or DECSHIN or whatever they're using then, they can rewrite the presentation filter and leave the underlying text in its pristine ascii condition for future data archaeologists. Those future data archaeologists, in 2100, will likely be writing those presentation filters in C, in ascii, if current trends hold true. They may not speak English, but if the raw text is in ascii, they'll be able to figure it out. Unicode? How many revisions will Unicode see between now and then? Thousands? The odds are slim we'll even be aligning on 8-bit words for text by then - but for ASCII one single rotting page from the manual from the first IBM PC or any of thousands of texts since then will be all they need for their Rosetta Stone.

      For all of the past that we can see, and the foreseeable future, simpler is better. It's easier to rewrite the presentation layer than to go back and validate that your filter reliably mangled the text. Remember that when the text is approved as acceptably mangled, the source is almost always tossed. 40 years later would be a bad time to discover that the site of the holy grail was encoded in some forgotten scheme thrice since remangled.

      Forgive me, but I've spent a disproportionate share of my programming time in data archaeology myself. I'm currently shepherding so much data from the '80's that it's quite not possible to validate that it's translating properly in a dozen man years - and this is a hobby for me. That data isn't 100MB. That was a lot when I started, but now my phone holds over 100x as much. Enterprise data shepherds with terabytes of legacy data, you have my sympathy.

      --
      Help stamp out iliturcy.
    11. Re:Project Gutenberg by Anonymous Coward · · Score: 0

      It would make MUCH more sense if programmers could handle Unicode source.
      Good luck with all of the visually-identical but semantically-different characters.
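To make the look-alike problem concrete, here is a small Python sketch. The `scripts_used` helper is hypothetical and deliberately crude: it treats the first word of each character's Unicode name as its script, which is a heuristic for illustration and not how a real confusables check would be built.

```python
import unicodedata

latin_o = "o"            # U+006F
cyrillic_o = "\u043e"    # U+043E, usually drawn identically to the Latin letter

# The characters look alike but are different code points with different names.
names = [unicodedata.name(c) for c in (latin_o, cyrillic_o)]


def scripts_used(identifier):
    """Crude heuristic: take the first word of each letter's Unicode name as its script."""
    return {unicodedata.name(ch).split()[0] for ch in identifier if ch.isalpha()}


# An identifier that reads as "foo" but hides a Cyrillic letter in the middle.
suspicious = "f" + cyrillic_o + "o"
mixed = scripts_used(suspicious)
```

An editor or linter could flag identifiers where a check like this returns more than one script, which would catch the mixed case without banning non-Latin names outright.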

    12. Re:Project Gutenberg by shutdown+-p+now · · Score: 2, Insightful

      Good point. Maybe one day Unicode will win out

      It's not a question anymore. Unicode has already won. The sheer amount of other specifications and standards that reference various versions of Unicode spec is such that it's going to stick around for decades to come.

      Yes, we still don't have 100% support in software (but we do have 99%). Time will fix that.

      or perhaps EBCDIC will have a resurgence. 'twixt now and then it's best to write the text in ascii, perhaps with a well-documented human-readable escape table for symbols that aren't represented - perhaps even a complete Unicode escape table current to the document.

      For programming languages, we already have that - \u1234 or \U12345678 are used as escape sequences in C++, Java and C# for just this purpose. There's nothing stopping an IDE from rendering them as if they were actual symbols and not escape sequences, too, though I haven't seen that in practice.

      But this is purely an encoding issue, not a character set issue, which is what TFA is about. They are asking why we still design languages with syntax that is restricted to characters only present in the ASCII character set, even though Unicode has many handy symbols that can represent the same things better and/or shorter. Quote:

      Unicode has the entire gamut of Greek letters, mathematical and technical symbols, brackets, brockets, sprockets, and weird and wonderful glyphs such as "Dentistry symbol light down and horizontal with wave" (0x23c7). Why do we still have to name variables OmegaZero when our computers now know how to render 0x03a9+0x2080 properly?
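The escapes and code points mentioned here can be poked at directly. A sketch in Python, whose string literals accept the same `\u` and `\U` escape forms; the variable names are arbitrary.

```python
import unicodedata

# The two code points from the quote, written as ASCII-only escape sequences.
omega_zero = "\u03a9\u2080"

# The same string built from the numeric code points.
from_code_points = chr(0x03A9) + chr(0x2080)

# The 8-digit form reaches the same character as the 4-digit form.
long_escape = "\U000003A9"

# Official character names for the two code points.
names = [unicodedata.name(c) for c in omega_zero]

# The dentistry glyph cited in the quote; its name comes from the Unicode database.
dentistry = unicodedata.name(chr(0x23C7))
```

The source file containing these lines is pure ASCII, yet the resulting strings are not, which is the distinction between encoding and character set being drawn in this comment.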

      For a good example of what is possible there, have a look at Fortress [PDF] programming language, which uses various traditional math symbols heavily.

      Unicode? How many revisions will Unicode see between now and then? Thousands?

      Unicode has been there for 18 years now (the second volume of Unicode 1.0 spec was published in 1992), and we've seen 5 revisions, so the rate is roughly 1 per 3.5 years. Assuming it stays the same, we're looking at Unicode 35.0 by 2100. But it won't, because in practice it will slow down eventually as we add most (and, eventually, all) scripts that we know and care about. In fact, if you look at the recent additions to the standard, they do not affect the vast majority of texts ever created in any way.

      On the other hand, it doesn't really matter in the slightest, since Unicode versions are all backwards-compatible (characters get added, but never removed or moved around). Assuming that trait persists, they'll just use the most recent version of the spec available to them.

      But then why would things be any different for ASCII-encoded text with escapes for Unicode characters? You'd still need a Unicode character table to make sense of those escapes.

      It would seem that you're arguing that any character set other than basic Latin is not future-proof. This implies that any text written in any language other than English is also not future-proof. I think this assertion is rather Anglo-centric, and not very realistic.

    13. Re:Project Gutenberg by the_womble · · Score: 3, Insightful

      The article is talking about using unicode, not a proprietary format. Do you think it likely that future text editors will be able to handle ASCII but not UTF-8?

    14. Re:Project Gutenberg by Anonymous Coward · · Score: 1, Interesting

      In any event, it would make good sense for programming environments to be able to handle Unicode source.

      I program in C++ and C# and all the tools I use can handle Unicode source. It's 2010. What can't?

    15. Re:Project Gutenberg by jimicus · · Score: 1

      strings .doc

      works for me. Doesn't do your formatting any favours, though.

    16. Re:Project Gutenberg by TheRaven64 · · Score: 0, Troll

      For programming languages, we already have that - \u1234 or \U12345678 are used as escape sequences in C++, Java and C# for just this purpose. There's nothing stopping an IDE from rendering them as if they were actual symbols and not escape sequences, too, though I haven't seen that in practice.

      A lot of C compilers expect unicode source files. As I recall, the Windows headers are all UTF-16, so the MS compilers are designed to handle unicode input. Clang expects UTF-8 source code. The language rejects non-ASCII symbols in identifiers, but you can use them in comments and string literals. The D compiler ignores HTML markup in the source code (I think - it did ten years ago when I last looked), so you can mark up your source code in any way that you like and have this markup preserved, although it's semantically irrelevant.

      --
      I am TheRaven on Soylent News
    17. Re:Project Gutenberg by martin-boundary · · Score: 1

      How many word processors can correctly read EBCDIC?

    18. Re:Project Gutenberg by Haeleth · · Score: 1

      There are a few issues with Unicode, in that CJK characters are lumped together by semantics, while LGC are not. Thus, while simplified Chinese, traditional Chinese, and Japanese may all write the same "character" differently, they are all represented by the same codepoint, while "o" despite being pronounced identically from the most common Latin-based written languages to Cyrillic are written with different codepoints, even despite having identical appearances.

      One design principle behind Unicode is that there should be round-trip conversions available for every common legacy encoding. There already existed character encodings that included complete Latin and Cyrillic alphabets, so Unicode had to include those separately as well. There were no character sets that distinguished between the Japanese way of writing a character and the traditional Chinese way of writing it, so it was not necessary to duplicate it.
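A quick illustration of the round-trip point, sketched in Python and using KOI8-R as the example legacy encoding (my choice; the comment does not name one). KOI8-R carries both Latin and Cyrillic letters, so the two look-alike letters "o" already had distinct byte values before Unicode existed.

```python
# A Cyrillic word, written with escapes so the snippet itself stays ASCII.
word = "\u043f\u0440\u0438\u0432\u0435\u0442"

# Legacy bytes -> Unicode -> legacy bytes comes back unchanged.
legacy = word.encode("koi8_r")
round_tripped = legacy.decode("koi8_r").encode("koi8_r")

# Latin "o" and Cyrillic "o" are separate characters in the legacy encoding too,
# so merging them in Unicode would have made this round trip lossy.
latin_o_byte = "o".encode("koi8_r")
cyrillic_o_byte = "\u043e".encode("koi8_r")
```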

      The simplified Chinese characters generally do have separate code points, BTW. It's just traditional Chinese, Japanese, and Korean variants that are unified. This is comparable to the way that there is only one encoding of the Arabic numerals, even though there are different ways of writing them (is 4 joined at the top? does 7 have a cross-stroke? etc).

    19. Re:Project Gutenberg by Haeleth · · Score: 1

      In contrast, I can't always read the MS Word files -- an example of an extended character set -- from even a few years ago, and I sure as hell can't view them in almost any editor. Sure, with enough time, I can or could, figure out how to read them, but, as the wise professor rightly pointed out, time is scarce.

      antiword is your friend. It converts MS Word files to ASCII.

    20. Re:Project Gutenberg by pz · · Score: 1

      I don't think it likely that the vast majority of future text editors will be able to handle UTF-8 correctly. I especially don't think it likely that using UTF-8 to include special symbols specific to individual programming languages will always display the correct symbols. (And, beside that fact, the field's experience with APL should steer us very strongly away from symbol-based programming; if you have no experience with APL, please take the advice from someone who has and stay away from that style of dense expression.)

      In contrast, I do think it likely that effectively all future text editors will be able to handle ASCII correctly.

      --

      Put my fist through my alarm clock with its ding-dong death inside my ear. - The Blackjacks.
    21. Re:Project Gutenberg by WillAdams · · Score: 1

      Right and since then they have had to go back and correct all of the older texts which have any:

        - accents
        - math
        - foreign language text outside of basic ASCII (e.g., Greek)
        - quotation marks (getting the directionality correct is non-trivial 'struth and is better done as markup so that it can be automatically converted between quote styles)

      It would've been much better to've used a decent mark-up scheme such as TeX or TEI to begin with.

      William

      --
      Sphinx of black quartz, judge my vow.
    22. Re:Project Gutenberg by Anonymous Coward · · Score: 0

      In any event, it would make good sense for programming environments to be able to handle Unicode source.

      But as with so many features in programming languages, just because they're there doesn't mean that using them is a good idea. In a certain IRC channel I frequent, it's policy to ask for code snippets with comments and identifiers in English, TYVM. And that while there are currently no native English speaking regulars in the channel.

    23. Re:Project Gutenberg by ztransform · · Score: 1

      History will always be the prevailing reason why plain-text is superior.

      Going forward one will always be able to look back and understand (well coded and documented) source code in plain text. Already, however, other formats are showing their age: trying to support old binary formats such as Wordstar, Word-Perfect, Microsoft Word, Corel Draw, and various other programs that lived and died a natural life-cycle. The format wars will also be difficult to support over a long time, already the IV50 video format appears to be lost in time, and various audio encodings may disappear.

      We are fortunate that we can still read historical texts. Olde English is a little difficult to comprehend but far simpler than deciphering hieroglyphics. Surely we owe it to our successors to be able to read what we've written?

      Arguably the simplicity of the English language is also one of the reasons it is the most dominant around the world: it was easy to code 26 characters into early computers. The poor Chinese were never going to win the early technology race by trying to cram 1,000s of characters into a small number of bits. Now that technology has caught up the Chinese have a chance to do something truly revolutionary (imagine if they wrote their own native operating system!).

    24. Re:Project Gutenberg by Jesus_666 · · Score: 1

      This is a bit of a false dichotomy as well. An ASCII-7 text is identical to the UTF-8 encoding of the same text.

      Of course this only applies to texts written entirely in standard English that don't use any unusual glyphs. Take any other language on earth or any English text using ligatures or other non-ASCII glyphs and you need to transliterate and/or use lossy conversion. ASCII is only good for a rather large subset of English texts.

      Case in point: I'd comment on how you implied the universal quantifier when only the existential one applies but neither ASCII nor Slashdot's unique character set include either of them.
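What "lossy conversion" looks like in practice, as a Python sketch with hypothetical helper names: strict ASCII encoding refuses the text outright, and the forgiving mode replaces every character it cannot represent with a question mark.

```python
def fits_in_ascii(text):
    """Return True only if every character of `text` is in the ASCII set."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return False
    return True


def to_ascii_lossy(text):
    """Force text into ASCII, replacing each unrepresentable character with '?'."""
    return text.encode("ascii", errors="replace").decode("ascii")


# English with a diaeresis and an accent, and the two quantifiers.
english_loanwords = to_ascii_lossy("na\u00efve caf\u00e9")
quantifiers = to_ascii_lossy("\u2200x \u2203y")
```

Both quantifiers collapse into the same `?`, so the distinction the comment wanted to draw cannot survive the conversion.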

      --
      USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
    25. Re:Project Gutenberg by mikechant · · Score: 1

      So being able to code using variable names written in your own language, vs. transliterating them into Latin characters is a huge benefit...

      A benefit only as long as you are sure that the code will only be worked on by people who share your language. In these days of outsourcing and trans-national development, the minimum common language is highly likely to be English - most programmers will have at least some experience with code where everything including comments and variable names is in English. You'll find it much harder to (say) find Indian programmers who are reasonably happy with (e.g.) Danish comments and variable names.

    26. Re:Project Gutenberg by shutdown+-p+now · · Score: 1

      As I recall, the Windows headers are all UTF-16

      They're not. I've never in my life seen a UTF-16 C/C++ header or source file. It may well be that VC++ can handle them, I just never saw one.

      Someone with mod points and on crack is around, by the way. Troll, really?

    27. Re:Project Gutenberg by Anonymous Coward · · Score: 0

      Yes?

    28. Re:Project Gutenberg by snowgirl · · Score: 1

      Ah, thanks for the info. :)

      --
      WARNING! This girl exceeds the MAXIMUM SAFE standards established by the FDA for BRATTINESS
    29. Re:Project Gutenberg by DragonWriter · · Score: 1

      In any event, it would make good sense for programming environments to be able to handle Unicode source.

      At the same time, it would be sensible for programming languages not to require code to use characters that are not easily entered from common keyboard layouts.

      Note that this is exactly the state already with some common programming languages: e.g., Ruby from 1.9 accepts source in any of a wide variety of encodings, including the various Unicode encodings, and can accept identifiers in characters that aren't available in ASCII.

      But none of the standard syntax or identifiers in the core or standard-library use such characters.

  4. huh by stoolpigeon · · Score: 3, Insightful

    so we should start coding in Chinese?

    Seems easier to spell words with a small set of symbols than to learn a new symbol for every item in a huge set of terms.

    --
    It's hard to believe that's how Micronians are made. Why don't we see it right now by having you both kiss one another?
    1. Re:huh by MightyYar · · Score: 4, Insightful

      so we should start coding in Chinese?

      Exactly! Keep the "alphabet" small, but the possible combination of "words" infinite.

      You don't need a glyph for "=>" for instance. Anyone who knows what = and > mean individually can discern the meaning.

      And further (I know, why RTFA?):

      But programs are still decisively vertical, to the point of being horizontally challenged. Why can't we pull minor scopes and subroutines out in that right-hand space and thus make them supportive to the understanding of the main body of code?

      This is easily done with a split screen, and sounds like an editor feature to me. Not sure why you'd want a programming language that was tied to monitor size and aspect ratio.

      Why not make color part of the syntax? Why not tell the compiler about protected code regions by putting them on a framed light gray background? Or provide hints about likely and unlikely code paths with a green or red background tint?

      Again, if you want this, do it in the editor. Doesn't he know anyone who is colorblind? And even a normally sighted user can only differentiate so many color choices, which would limit the language. And forget looking up things on Google: "Meaning of green highlighted code"... no wait "Meaning of hunter-green highlighted code" hmmmm... "Meaning of light-green highlighted code"... you get the idea.

      --
      W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
    2. Re:huh by jonbryce · · Score: 2, Interesting

      No, but I think the idea of being able to draw flowcharts on the screen and attach code to each of the boxes could be an idea that has mileage.

    3. Re:huh by Kjella · · Score: 1

      Let me take an example: in Norwegian, year = år. That means that for a billing system it might be fully reasonable to have classes related to financial years (finansår), close of year (årsavslutning), the tax report (årsoppgave) and so on. In practice everybody sticks to A-Z, but it's a system limitation, not a natural one.

      --
      Live today, because you never know what tomorrow brings
    4. Re:huh by CensorshipDonkey · · Score: 4, Interesting

      Have you ever used a visual diagrammatic code language before, such as LabView? Every scientist I've ever met that had any experience writing code vastly prefers the C based LabWindows to the diagrammatic LabView - diagrammatic is simply a fucking pain in the ass. Reading someone else's program is an exercise in pain, and they are impossible to debug. Black and white, unambiguous plain text coding may not be pretty to look at but it is damn functional. Coding requires expressing yourself in an explicitly clear fashion, and that's what the current languages offer.

    5. Re:huh by gman003 · · Score: 1

      Been done. Extremely inflexible. Not all that fast. Only liked by teachers (easier than teaching proper flow control) and managers (who can only read flowcharts anyways).

    6. Re:huh by yuje · · Score: 1

      We don't need to start coding in Chinese, but the Chinese should certainly be given the option of coding in Chinese. Not everyone is equally gifted at learning foreign languages (nor may they be given the opportunity or practice to), but they may yet be brilliant programmers. Being able to draw upon more programmers from a population of a billion plus by giving them the ability to program within their native writing has the potential to unleash a lot of creativity into the field. If only 1% of the population had the potential to be good programmers but perhaps not in English, and only 1% of those turn out to be brilliant programmers, we'd still be unleashing a hundred thousand of them into the world, and who knows what kind of ideas they would have, given the opportunity to express them differently than traditional programming languages?

    7. Re:huh by ScrewMaster · · Score: 5, Insightful

      diagrammatic is simply a fucking pain in the ass.

      Amen.

      Every scientist I've ever met that had any experience writing code vastly prefers the C based LabWindows to the diagrammatic LabView

      Well, I'm not a scientist, just a humble software engineer, and back in my contract coding days I was always faced by managers that would try to push me to use LabView. They had this mistaken belief that because it was "visual" they could a. understand it and b. thought it was simpler and c. thought I should charge less if I used it.

      I told them that a. it's still programming, and beyond a certain level of complexity understanding still requires sufficient knowledge and b. refer to a. and c. if they were going to force me to waste time fighting such an environment up 'til the point where I found something critical that it couldn't do (such as run fast enough) and would end up re-coding the right way anyway, they damn well weren't going to pay me less.

      --
      The higher the technology, the sharper that two-edged sword.
    8. Re:huh by mr_mischief · · Score: 3, Insightful

      Let someone who reads and writes Chinese develop a programming language with Chinese keywords and syntax, then. Programming in English-like languages has largely been a waste of time, remember. English keywords are great, but using English syntax for a programming language is a nightmare. Everyone uses a syntax that's simpler than English. Even Perl's grammar is simpler than English, and that grammar is massive compared to most programming languages.

    9. Re:huh by tftp · · Score: 4, Insightful

      it might be fully resonable to have classes related to financial years (finansår), close of year (årsavslutning), the tax report (årsoppgave) and so on.

      And one day the code is sold to China or India, and then people there can't even find a way to enter the glyph. Same if a visiting programmer has to work on the code, or if you need to send a class to another country for some reason.

      How far would Linux get if Linus decided to use Finnish (or Swedish) words written with all the proper UNICODE characters for all the variables and types?

    10. Re:huh by icebraining · · Score: 1

      ChinesePython.

      What is ChinesePython ?
      ChinesePython is a sort of translation work of the Python language into Chinese. The most notable change to the Python language is that its keywords, variable names, builtin types and their methods are all translated into Chinese. That enables a programmer to write Chinese programs in Python's style.

      Personally, while it might be useful for learning programming, I think it's stupid to write in any language other than English. English is the current lingua franca just like Latin once was, and if you want to join the world's programming community, you need to learn it.

      I'm Portuguese, and I wouldn't ever think about programming in Portuguese, nor using accents in function/variable names. It's almost rude for any project that involves other people.

    11. Re:huh by Vegemeister · · Score: 1

      Kill it with fire.

    12. Re:huh by mywhitewolf · · Score: 1
      The English language is a great communication tool for extremely powerful parallel-processing bio-organisms to convey complicated concepts in a lossy format, so that the original meaning is inferred through experience and "most likely meaning". Modern written language is an attempt to digitize a mostly analogue information transmission mechanism.

      Programming is a method of digitally communicating simple yet highly specific ideas to a digital entity. We really shouldn't be focusing on trying to make programming languages easier to read for non-technical people. The focus should be on making programming more logical so that language is not as much of an issue.

      Not everyone is equally gifted at learning foreign languages (nor may they be given the opportunity or practice to), but they may yet be brilliant programmers.

      I don't think learning the symbolic meaning of certain character groups would put you at a significantly greater disadvantage than learning programming from scratch, where you have to learn that an "English-like word" has a particular meaning to the computer anyway.

    13. Re:huh by bh_doc · · Score: 3, Insightful

      As a scientist who has a fair bit of coding experience, including LabVIEW, ++ this.

      What particularly annoys me about visual code like LabVIEW is that you can't diff. So change tracking is a pain in the arse, and forget distributed development.

      LabVIEW itself is good for setting up a quick UI and connecting things to it, but any serious processing? ...No, thanks. If I could get my hands on something else that had the UI prototyping ease, connectivity to experimental devices (motion controllers, for example), but based on a textual language, I'd be a happy camper. (There are some things that come close, I'm sure, though I've not had the time to properly search. Busy scientist is busy...)

    14. Re:huh by Anonymous Coward · · Score: 0

      so we should start coding in Chinese?

      There are some characters in Chinese that you can not even type on a computer. So, I recommend that we don't let ourselves be shackled by the limitations of Unicode and use those characters in our new programming languages. While we are at it, we can attach our programs to carrier pigeons and fly them across the world to the compilers who will check them for proper penmanship and syntax errors, and shoot fireworks in the sky to indicate whether or not our program compiled successfully.

      On the other hand, we could just type ASCII since that works just as well but is much easier.

    15. Re:huh by Anonymous Coward · · Score: 0

      However, you would do well to improve your use of whitespace.

    16. Re:huh by wiredlogic · · Score: 1

      It always befuddles me that software engineers get hardons over the latest fad 2D diagramming tools when in the electrical engineering world there has been a move away from schematics to HDLs for digital logic design. With 2D diagramming you waste a lot of time maintaining the layout and rearranging things for visual presentation. Vendor lock-in is much more of a problem than with standardized code. For smaller problems diagramming has its merits but it doesn't scale up as easily as code does.

      --
      I am becoming gerund, destroyer of verbs.
    17. Re:huh by burkmat · · Score: 1

      Not to mention the differences between monitors.

      At my previous job we differentiated different tickets through a coloring scheme. These colors were pretty much unique (luckily there weren't that many tickets), but still helped a lot. It's quite effective just using basic pattern matching against the title of the ticket and the background color.
      Every now and then, a monitor would go dead and be replaced - suddenly all the colors had shifted, ranging from minuscule amounts to red going dark brown. Color might be a good idea when you don't actually need to reference anything by it. Documenting colors would be impossible.

    18. Re:huh by stephanruby · · Score: 1

      Maybe he's talking about Google's App Inventor. It's more like sets of blocks, drawings, and numbers (yes, numbers are still used), and (except for some of the labels) they don't seem to map to any human-readable ASCII source code (so in that sense, the drawings are the source).

    19. Re:huh by MaskedSlacker · · Score: 1

      Nuke it from orbit.

    20. Re:huh by jimjamjoh · · Score: 1

      Coding requires expressing yourself in an explicitly clear fashion, and that's what the current languages offer.

      I take it you've never tried to read Perl...

    21. Re:huh by Anonymous Coward · · Score: 0

      How far would Linux get if Linus decided to use Finnish (or Swedish) words written with all the proper UNICODE characters for all the variables and types?

      not very far, since gcc doesn't accept unicode in variable or type names

    22. Re:huh by pjt33 · · Score: 1

      I have a Java program that I wrote in Spanish, and some of the classes had accented characters in their names - Java is fully Unicode already. I had to change them because some filesystems weren't handling the files properly. So language support isn't the only thing that's necessary.
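
      The comment doesn't say exactly what the file systems got wrong. One plausible cause (an assumption, not something stated above) is that the same accented name can be stored in two different Unicode normalization forms, so a tool looking for one byte sequence doesn't find the other. A minimal Python sketch with a hypothetical file name:

```python
import unicodedata

# Hypothetical file name, written with explicit escapes so the two forms
# stay distinct no matter how this text itself was saved.
composed = "A\u00f1o.java"      # "ñ" as one precomposed code point (NFC)
decomposed = "An\u0303o.java"   # "n" followed by a combining tilde (NFD)


def same_name(a, b):
    """Compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


raw_equal = composed == decomposed                    # plain comparison
normalized_equal = same_name(composed, decomposed)    # comparison after NFC
```

      Both strings look identical on screen, which is what makes this kind of mismatch hard to spot.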

    23. Re:huh by cronius · · Score: 1

      Oh god, please don't program in non-English. I'm a Norwegian myself and currently working on (Perl) code that uses Norwegian for comments, sub names and variables. The problem is that the Perl API and keywords are of course English (like any other programming language), so at best you get a horrible mix of English and native (in this case Norwegian). It's much much clearer to read code that is consistently written in *one* language (programming is communication after all).

      The inconsistency and constant parsing the brain has to do (is this Norwegian or English? does this variable name make more sense in Norwegian or English?) is an energy drain, not to mention annoying. This is true for any mix of languages I think (that uses the same alphabet/characters).

      Plus, as others have pointed out, using native language in the code is all nice and good until someone that doesn't speak that language wants or needs to look at the code. In the IT business, using non-native consultants or even outsourcing a programming task isn't uncommon. Heck, once the company goes international you end up with the problem of having to translate everything from native to English anyway.

      --
      Life is Reality
    24. Re:huh by Timmmm · · Score: 1

      Labview is 'visual programming' done in about the worst way possible. Even the UI is badly done.

      I think something more like Scratch would work, where instead of using ASCII sequences to represent tokens and blocks, graphical objects can be snapped together like a jigsaw. The code still looks something like a normal program, instead of a tangled mess of multi-layered balls of string like Labview. And there's no possibility for syntax errors, type mismatches, etc. The compiler would be much faster and simpler because it already has the AST loaded.
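
      The "AST already loaded" part can be tried with Python's standard ast module, which lets a program hand the compiler a tree instead of text. This is only a sketch of the idea; that a block editor would keep its program in this shape is an assumption.

```python
import ast

# Build the expression (2 + 3) * 7 directly as tree nodes, the way a
# block-based editor might hold it in memory, with no source text involved.
tree = ast.Expression(
    body=ast.BinOp(
        left=ast.BinOp(left=ast.Constant(2), op=ast.Add(), right=ast.Constant(3)),
        op=ast.Mult(),
        right=ast.Constant(7),
    )
)
ast.fix_missing_locations(tree)  # fill in the line/column info compile() requires

code = compile(tree, filename="<blocks>", mode="eval")  # no tokenizing or parsing step
result = eval(code)
```

      Since there is no text to tokenize, a malformed-token error cannot occur here. Type mismatches are a separate matter that this sketch does not address.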

      Of course it has downsides. Typing is fast, and easy to share online. It works well with VCSs. Still, I'd like to see a serious programming language done in that style (one could easily do C for example).

    25. Re:huh by ScrewMaster · · Score: 1

      However, you would do well to improve your use of whitespace.

      Well, I'm not a scientist, just a humble software engineer, and back in my contract coding days I was always faced by managers that would try to push me to use LabView. They had this mistaken belief that because it was "visual" they could:

      a. understand it,

      b. thought it was simpler,

      c. thought I should charge less if I used it.

      I told them that:

      a. it's still programming, and beyond a certain level of complexity understanding still requires sufficient knowledge,

      b. refer to a.,

      c. if they were going to force me to waste time fighting such an environment up 'til the point where I found something critical that it couldn't do (such as run fast enough) and would end up re-coding the right way anyway, they damn well weren't going to pay me less.

      Is that better? You'd have done well as one of those managers.

      --
      The higher the technology, the sharper that two-edged sword.
    26. Re:huh by AB3A · · Score: 1

      Agreed. In a broader application, Function Block Diagramming is one of the silliest features in IEC 61131 (the common interface specification for Programmable Logic Controllers). Not much better: Relay Ladder Logic diagrams. I know, this is heresy among many people who "believe" in this stuff.

      I want my program text to appear in a concise, easy to read format. The logic in RLL requires too much scrolling and is simply too diffusely displayed on the screen. People who use this stuff often forget that someone with a laptop may be standing in front of the controller in the field, wondering why it is behaving the way it is. Having to scroll across many screens just to see a single line of logic doesn't help understand what the code is supposed to do.

      A simple text based line of logic equations would be much easier to parse. However, old habits die hard on the plant floor. There is still this notion that real electricians will want to look at this code and that all those mysterious look-alike blocks with functions that range from timers to shift registers will be comprehensible to them.

      Stick to ASCII. Stick to the stuff that everyone knows how to read. Displays and code can be made pretty with a good text editor, but the original software needs to be comprehensible by mere mortals.

      --
      Nearly fifty percent of all graduates come from the bottom half of the class!
    27. Re:huh by Sal+Zeta · · Score: 1

      It's commonly done with multimedia programming languages for audiovisual generation. See tools like puredata, Max/Msp, or vvvv

    28. Re:huh by James+McGuigan · · Score: 1

      Mod parent up

    29. Re:huh by mypalmike · · Score: 1

      > No, but I think the idea of being able to draw flowcharts on the screen and attach code to each of the boxes could be an idea that has mileage.

      There are quite a few such things:

      http://en.wikipedia.org/wiki/Visual_programming_language

      The one I am most familiar with is Prograph, now apparently "Marten" (see www.andescotia.com). It's neat in concept, but somehow you end up with code that's harder to "read" than C.

      --
      There are 0x40000000 types of people: those who understand 32-bit IEEE 754 floating point, and those who don't.
    30. Re:huh by Jesus_666 · · Score: 1

      You don't need a glyph for "=>" for instance. Anyone who knows what = and > mean individually can discern the meaning.

      Of course this can lead to clashing conventions. Most programmers would agree that "a--" means to decrement the variable a. Many TeX users would agree that it means there's an en-dash behind the "a".

      Unicode in control characters might make sense in very specific circumstances. I cite the program suite "Language, Proof and Logic", which is used in university logic courses. You use special, program-specific keybindings to type quantifiers not easily found on any standard keyboard. There's a learning curve but in the context of the programs it's much better to have access to domain-specific glyphs like quantifiers than to use your imagination and just pretend that "\-/" is the universal quantifier. And since the set of required domain-specific characters is rather small it's not a big problem to get used to the programs.

      Of course you could do like IMEs do and replace certain n-graphs with the appropriate domain-specific glyphs as the user types them. Or, alternatively, store them as n-graphs but display them as domain-specific glyphs. Anyway, I submit that in domains with well-established glyphs it might be a good idea to actually use those glyphs instead of making up kind-of-similar ASCII n-graphs and trying to establish those as a parallel standard.
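
      The "store as n-graphs, display as glyphs" variant is small enough to sketch in Python. The mapping table below is hypothetical, and a real editor would also have to leave string literals and comments alone, which this version does not attempt.

```python
# Hypothetical mapping from ASCII n-graphs to the glyphs shown on screen.
GLYPHS = {
    "\\-/": "\u2200",  # universal quantifier
    "->": "\u2192",    # implication arrow
    ">=": "\u2265",    # greater than or equal
    "!=": "\u2260",    # not equal
}


def render(source):
    """Return a display string, replacing the longest n-graphs first."""
    for ngraph in sorted(GLYPHS, key=len, reverse=True):
        source = source.replace(ngraph, GLYPHS[ngraph])
    return source


stored = "\\-/x (x >= 0 -> x != -1)"   # what is saved to disk: plain ASCII
shown = render(stored)                  # what the user sees in the editor
```

      The file on disk stays ASCII, so diff, grep and every existing tool keep working.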

      --
      USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
    31. Re:huh by Jesus_666 · · Score: 1

      I am not certain that "you might one day sell your code to a foreign company that is somehow unable to do a global search-and-replace" should limit your design. You should write code appropriate to your domain. If your domain happens to be the Norwegian tax code it seems reasonable to use proper terminology (especially since it seems unlikely that China is going to adopt the Norwegian tax code).

      Or is it "never code something that might be inconvenient to foreign coders"? That would point to C being badly designed as it uses tons of curly braces and square brackets which are rather inconvenient to type on a German QWERTZ keyboard. Ritchie really should've restrained himself.

      --
      USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
    32. Re:huh by rjstanford · · Score: 1

      I once worked with a programmer who thought he was super l33t and "correct" - he used class names like StupidFaçade (although his were all more like ReallyCleverFaçade). Worst. Code. Ever. If you weren't using autocomplete it really sucked, and if you were it was totally unimportant. Things like find-and-replace though, where autocomplete wasn't available, generally forced me into cut-and-paste hell.

      --
      You're special forces then? That's great! I just love your olympics!
    33. Re:huh by tftp · · Score: 1

      I am not certain that "you might at one day sell your code to a foreign company that is somehow unable to do a global search-and-replace" should limit your design. You should write code appropriate to your domain.

      If you write the code that is appropriate for your domain then one day some small company like SAP approaches you for acquisition, does its due diligence and then its coders scream bloody murder. The acquisition falls through. That would be kind of a high price to pay for a few keystrokes. Businesses exist to make money, not to multiply risks for no good reason.

      Global search and replace is a risky thing; I did my share of refactoring and know that firsthand. This is particularly difficult when several languages are mixed (like assembly and C, which is common in embedded systems.) You need to be sure that each and every replacement is not already present in the scope, for example (but they may be allowed outside of the scope.) If you are selling a million LOC codebase (which is not something unique when you are being acquired) such refactoring in itself is a major project. Do you want to give the buyer a good reason to drop the price by 10% or 20% ? Probably not, if you are the owner. Business owners like to play it safe.
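
      The collision check described above can be sketched in a few lines of Python. This is a deliberately naive version: it works on one blob of text, knows nothing about scopes or about mixed C and assembly, and the function name rename_identifier is made up for the example.

```python
import re


def rename_identifier(source, old, new):
    """Rename `old` to `new`, matching whole identifiers only.

    Raises ValueError if `new` already occurs as an identifier, because the
    rename would then silently merge two different names.
    """
    def pattern(name):
        # (?<!\w) and (?!\w) serve as Unicode-aware identifier boundaries.
        return re.compile(r"(?<!\w)" + re.escape(name) + r"(?!\w)")

    if pattern(new).search(source):
        raise ValueError(f"{new!r} is already used in this source")
    return pattern(old).sub(lambda _match: new, source)


code = "årsoppgave = 1\ntotal_årsoppgave = årsoppgave + 1\n"
renamed = rename_identifier(code, "årsoppgave", "tax_report")
```

      Note that total_årsoppgave is left untouched because it is a different identifier. Making that decision correctly across a million lines and several languages is the expensive part.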

      curly braces and square brackets which are rather inconvenient to type on a German QWERTZ keyboard

      Buy a QWERTY keyboard for $10 (or whatever Euro equivalent,) plug it in and enjoy. You won't need too many ü, ä or ö in the C code, and if you do (in comments) then map them to something else. It's not that hard to remember keys for three characters. I do a similar thing myself for U+0451 and U+044a (they are mapped to Alt-Ctrl-8 and Alt-Ctrl-0.)

    34. Re:huh by Spykk · · Score: 1

      You don't need a glyph for "=>" for instance. Anyone who knows what = and > mean individually can discern the meaning.

      "=>" is not the same thing as ">=". In C#, for example, "=>" is the lambda operator.

    35. Re:huh by DragonWriter · · Score: 1

      As a scientist who has a fair bit of coding experience, including LabVIEW, ++ this.

      What particularly annoys me about visual code like LabVIEW is that you can't diff. So change tracking is a pain in the arse, and forget distributed development.

      This no doubt may be a problem with LabVIEW, but since the visual "code" shown in the interface has to be linearized somehow to be processed, there's no reason an environment with a visual code interface couldn't support diff (and translate the linear diff into, e.g., highlighting, etc., on the visual code display.)

      Visual environments are inherently just interfaces to underlying linear code, and the only reason they don't support equivalents of tools that exist for linear code is because no one has bothered to develop the equivalent of those tools.

      Of course, different people have different degrees of proficiency with visual environments vs. linear code, so it would certainly be good for any visual environment to also provide direct access to the linear code as well, with no bias as to input mechanism (that is, anything that is syntactically valid in the linear representation should be representable in the visual environment, and vice versa.)
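
      A rough Python sketch of that idea: serialize the diagram into a canonical line-per-item form, then let an ordinary text diff do the rest. The graph structure here is hypothetical and has nothing to do with LabVIEW's actual file format.

```python
import difflib


def linearize(graph):
    """Serialize a block diagram into sorted, one-item-per-line text.

    `graph` is a hypothetical structure: {"nodes": {id: kind}, "wires": [(src, dst)]}.
    Sorting makes the output independent of on-screen layout and edit order.
    """
    lines = [f"node {node_id} {kind}" for node_id, kind in sorted(graph["nodes"].items())]
    lines += [f"wire {src} -> {dst}" for src, dst in sorted(graph["wires"])]
    return lines


old = {"nodes": {"in": "sensor", "f": "filter", "out": "plot"},
       "wires": [("in", "f"), ("f", "out")]}
new = {"nodes": {"in": "sensor", "f": "filter", "g": "gain", "out": "plot"},
       "wires": [("in", "f"), ("f", "g"), ("g", "out")]}

# Standard unified diff over the two linear forms.
diff = list(difflib.unified_diff(linearize(old), linearize(new),
                                 "old", "new", lineterm=""))
```

      Each "+" or "-" line names a node or wire, so an editor could map it back to a highlight on the canvas.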

    36. Re:huh by Bigjeff5 · · Score: 1

      Why can't you use paragraphs like normal people, so non-programmers can understand you? ;)

      --
      Security is mostly a superstition... Avoiding danger is no safer in the long run than outright exposure. - Helen Keller
    37. Re:huh by Anonymous Coward · · Score: 0

      Perseestä jos tiedän

    38. Re:huh by ScrewMaster · · Score: 1

      Why can't you use paragraphs like normal people, so non-programmers can understand you? ;)

      {sigh}

      --
      The higher the technology, the sharper that two-edged sword.
  5. Learn2code by santax · · Score: 4, Insightful

    I can express my intentions just fine with ASCII. They have cunningly invented a system for that. It's called language and it comes in very handy. The only thing I would consider missing is a pile of shit-character. I could use that one right now.

    1. Re:Learn2code by Anonymous Coward · · Score: 0

      A pile of shit character is available in Unicode 6.0.

    2. Re:Learn2code by MightyYar · · Score: 3, Funny

      You mean "@"? Looks like a pile of shit to me.

      --
      W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
    3. Re:Learn2code by santax · · Score: 2, Informative

      Oh crap... I guess you can forget about my earlier comment. I'm adopting unicode as we speak! U+1F4A9 ftw.

    4. Re:Learn2code by Anonymous Coward · · Score: 0

      Surely you mean U+2668, #9823;

    5. Re:Learn2code by Noughmad · · Score: 5, Funny

      I don't know about you, but I have a pile-of-shit key on my keyboard, right between the left Ctrl and Alt.

      --
      PlusFive Slashdot reader for Android. Can post comments.
    6. Re:Learn2code by ScrewMaster · · Score: 1

      I don't know about you, but I have a pile-of-shit key on my keyboard, right between the left Ctrl and Alt.

      Ha ... wish I had mod points. I'm typing this on an old Thinkpad R40, which thankfully doesn't have the pile-of-shit key. I've always loved the Thinkpad keyboard layout, and the fact that it's missing the POS key was just icing on the cake.

      --
      The higher the technology, the sharper that two-edged sword.
    7. Re:Learn2code by Arkham · · Score: 1
      --
      - Vincit qui patitur.
    8. Re:Learn2code by nanospook · · Score: 1

      Depends on your perspective, A*S*S*H*O*L*E! (not you though :)

      --
      Have you fscked your local propeller head today?
    9. Re:Learn2code by Anonymous Coward · · Score: 0

      I can't seem to find it on my Model M. There is just empty space there.

    10. Re:Learn2code by nu1x · · Score: 1

      Well, it also can be considered a pile of shit (as viewed by eagle, a noble bird, in flight).

      --
      I have nothing to lose but my bindings.
    11. Re:Learn2code by isorox · · Score: 2, Interesting

      I don't know about you, but I have a pile-of-shit key on my keyboard, right between the left Ctrl and Alt.

      It's a very useful "meta" key. Aside from controlling my music from amarok, I have a variety of mappings set up, Meta-s shades the window I'm using, Meta-R pops up a run dialog, Meta-CapsLock pops up an rxvt terminal window, Meta-F4 runs xrandr --auto and reconfigures when I plug in an external monitor.

      (Capslock itself is mapped to Escape, which I find a lot easier on the wrists on my laptop than using the real escape key -- I rebound it about 5 years ago when my escape key broke and haven't looked back)

    12. Re:Learn2code by Anonymous Coward · · Score: 0

      That key used to crash 3dStudio R4, losing all the artists' work. So they used to physically rip the key from the keyboard.

    13. Re:Learn2code by Late+Adopter · · Score: 1

      That's the SHIFT key, not shit. Oh, wait, I bet you have one of those new-fangled "Caps-lock" keys.

    14. Re:Learn2code by vegiVamp · · Score: 1

      This. Exactly this.

      I don't want to have to remember the difference between set, sét and sèt.

      --
      What a depressingly stupid machine.
    15. Re:Learn2code by miknix · · Score: 1
    16. Re:Learn2code by Bigjeff5 · · Score: 1

      You didn't look at your keyboard before typing that, did you?

      On keyboards that have a key between ctrl and alt it's the Windows key. The vast majority of keyboards have them, since the vast majority of computers are Windows machines, and it's a really handy key for Windows shortcuts.

      In case you're still too slow to get it (which wouldn't surprise me, since you seem to think the alt key is somewhere above the capslock key), it was a jab at Windows.

      --
      Security is mostly a superstition... Avoiding danger is no safer in the long run than outright exposure. - Helen Keller
  6. Yes, Unicode is "the new black" by Antique+Geekmeister · · Score: 2, Informative

    Yes, it's the next fad that just _everyone_ has to wear this season. Within 5 years, it will be something else, and given the ability of major vendors like Microsoft to get Unicode _wrong_, it's not stable for mission critical applications. If you want your code to remain parseable and cross-platform compatible and stable in both large and small tools, write it in flat, 7-bit ASCII. You also get a significant performance benefit from avoiding the testing and decoding and localization and most especially the _testing_ costs for multiple regions.

    Look up "microsoft unicode error" on Google for hundreds if not thousands of examples. ASCII for code is like flat text for email. It assures that you're not simply publishing coding spam, and actually wrote what you meant.

    1. Re:Yes, Unicode is "the new black" by shutdown+-p+now · · Score: 2, Insightful

      Yes, it's the next fad that just _everyone_ has to wear this season. Within 5 years, it will be something else

      Unicode has been around for, what, over 15 years now? It's part of countless specifications from W3C and ISO. All modern OSes and DEs (Windows, OS X, KDE, Gnome) use one or another encoding of Unicode as the default representation for strings. No, it's not going away anytime soon.

      If you want your code to remain parseable and cross-platform compatible and stable in both large and small tools, write it in flat, 7-bit ASCII.

      This may be a piece of good advice. Even for languages where Unicode in the source is officially allowed by the spec (e.g. Java or C#), many third-party tools are broken in that regard.

      You also get a significant performance benefit from avoiding the testing and decoding and localization and most especially the _testing_ costs for multiple regions.

      I don't see how this has any relevance to your previous point (writing the source code in ASCII). If your app source is in Unicode, it will still compile (or not compile) the same on any locale. And what would you be testing? The compiler?

      I've no idea what "decoding and localization" means in this context, either.

      Well, unless you're also advocating for the use of ASCII as the default runtime string encoding in apps, and completely forgoing localization. Which is fine if you only intend your app to be used in the USA, I guess (and even then, considering take-up of Spanish, it may not be such a wise idea).

    2. Re:Yes, Unicode is "the new black" by Anonymous Coward · · Score: 2, Insightful

      "Yes, it's the next fad that just _everyone_ has to wear this season."

      Like the Metric System.

    3. Re:Yes, Unicode is "the new black" by petermgreen · · Score: 1

      and most especially the _testing_ costs for multiple regions.

      Heh, you still need to test on multiple language versions of your OS even if all your text is 7-bit ASCII. For example you need to figure out where you will be using the local conventions for decimal separators and where you will be using the dot, and make sure you use the right conversion routines in the right place. Failure to do this will lead to software that works fine on English systems but may break on continental European ones.
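
      A small Python sketch of keeping the two cases apart. It avoids the locale module on purpose so the behavior doesn't depend on which locales are installed; parse_decimal and its parameters are made up for the example.

```python
def parse_decimal(text, decimal_sep=".", group_sep=""):
    """Parse `text` as a float using explicitly chosen separators.

    Keep the defaults for machine-readable data (config files, protocols);
    pass the user's separators only for text typed or read by people.
    """
    if group_sep:
        text = text.replace(group_sep, "")
    if decimal_sep != ".":
        if "." in text:
            raise ValueError(f"unexpected '.' in {text!r}")
        text = text.replace(decimal_sep, ".")
    return float(text)


from_file = parse_decimal("1234.5")                                   # fixed file format
from_user = parse_decimal("1.234,5", decimal_sep=",", group_sep=".")  # German-style input
```

      The bug described above is what happens when user input goes through the first call, or a data file through the second.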

      --
      note: i'm known as plugwash most places but i screwd up registering that here somehow in the past and now can't register
    4. Re:Yes, Unicode is "the new black" by scdeimos · · Score: 3, Insightful

      Unicode has been around for, what, over 15 years now? It's part of countless specifications from W3C and ISO. All modern OSes and DEs (Windows, OS X, KDE, Gnome) use one or another encoding of Unicode as the default representation for strings. No, it's not going away anytime soon.

      And yet major vendors like Microsoft still get Unicode wrong. A couple of examples:

      • Windows Find/Search cannot find matches in Unicode text files, surely one of the simplest file formats of all, even though the command line FIND tool can (unless you install/enable Windows Indexing Service which then cripples the system with its stupid default indexing policies). This has been broken since Windows NT 4.0.
      • Microsoft Excel cannot open Unicode CSV and tab-delimited files automatically (i.e.: by drag-and-drop or double-click from Explorer) - you have to go through Excel's File/Open menu and go through the stupid import wizard.
      • Abuse of Unicode code points by various Office apps, causing interoperability issues even amongst themselves.
    5. Re:Yes, Unicode is "the new black" by shutdown+-p+now · · Score: 1

      It's no surprise that Unicode is more complicated than ASCII - given that it solves a much bigger problem. It's also no surprise that the consequence is that more people make more mistakes. Even so, as a non-English-speaking individual, I can tell you that a Unicode-aware application with some problems is far better than a perfect but non-Unicode-aware one.

    6. Re:Yes, Unicode is "the new black" by Bomarc · · Score: 0

      MS Excel also has other ... issues. Try sorting with ASCII some time. Which brings up a related topic: Everyone (okay, not Microsoft/Excel) can sort, find, and compare with ASCII. Unicode? I spent two days fighting PostgreSQL trying to get Unicode chars to match.

    7. Re:Yes, Unicode is "the new black" by camperdave · · Score: 1

      Here's the thing though, unicode and ASCII are two different tools for two different jobs. Unicode is a map of alphabet and symbols to a byte code. ASCII is a tool for information interchange. It has facilities for managing communications, controlling printing, and making sounds.

      --
      When our name is on the back of your car, we're behind you all the way!
    8. Re:Yes, Unicode is "the new black" by Grail · · Score: 1

      Good luck selling your software to half the population of the USA (much less the rest of the world) when the only language you can use is American English.

      Internationalisation and localisation are separate to the issue of source code being able to represent variable names and string constants written in Unicode, eg: "écrire" rather than "ecrire" will keep your French users happier. A variable named fichier_à_écrire might make more sense to a French programmer too (I'm only using Google translate here, I have no idea how to correctly translate "output file").

      To insist that the French programmer uses "fichier_a_ecrire" instead would be equivalent to insisting that English speaking programmers use "1" for lower-case L, upper-case I, and the shell "pipe" - hey, they all look the same anyway, right? H0w d1ff1cu1t d0e5 1t need t0 be bef0re y0u rea1i5e that 11m1t1ng character 5et5 15 a 51gn1f1cant hand1cap?

    9. Re:Yes, Unicode is "the new black" by shutdown+-p+now · · Score: 1

      Unicode is a map of alphabet and symbols to a byte code.

      So is ASCII. It just maps much fewer symbols.

    10. Re:Yes, Unicode is "the new black" by Anonymous Coward · · Score: 0, Informative

      Windows Find/Search cannot find matches in Unicode text files, surely one of the simplest file formats of all, even though the command line FIND tool can (unless you install/enable Windows Indexing Service which then cripples the system with its stupid default indexing policies). This has been broken since Windows NT 4.0.

      You cannot find in files at all in Windows 7's Explorer without indexing enabled: it's 100% broken. All it shows is how much Microsoft cared about fixing the non-default configuration, which is to say, they didn't care. You've only shown the responsible MS team's ineptitude, not some greater impossibleness of proper Unicode handling.

    11. Re:Yes, Unicode is "the new black" by Anonymous Coward · · Score: 0

      Unicode is a map of alphabet and symbols to a byte code.

      So is ASCII. It just maps much fewer symbols.

      That's only part of ASCII.

    12. Re:Yes, Unicode is "the new black" by shutdown+-p+now · · Score: 1

      It is the only part of ASCII that matters in the context of this discussion. From TFA it is clear that "ASCII" here refers to the fairly limited character set, and nothing else.

    13. Re:Yes, Unicode is "the new black" by Deorus · · Score: 1

      As another non-English speaker I can tell you that I've had more trouble with Unicode which is supposed to solve problems than I've ever had with ASCII, latin1, and latin15 exponentially combined.

    14. Re:Yes, Unicode is "the new black" by shutdown+-p+now · · Score: 2, Interesting

      From your reference to Latin-1, I suspect you're from Western Europe, then. If so, then you guys didn't have it all that bad - most non-Unicode-aware apps are not truly ASCII (since we don't have 7-bit bytes around), and so the default encoding more often than not is Latin-1. Even if Americans mostly use it for "funny chars" like special quotation marks etc, you end up with a bunch of useful symbols as well. And your text doesn't end up all garbled.

      For folks from Eastern European countries, especially those with non-Latin-based alphabets - like mine - it's a rather different story. Extrapolating that, it must really suck for people with more "exotic" requirements, like Arabic or Chinese...

    15. Re:Yes, Unicode is "the new black" by keeboo · · Score: 1

      That's a good analogy you used in English: "1" for "l", "o" for "0", "S" for "$" and so on.

      Still, I find it problematic to use non-ASCII characters in variable names for a number of reasons:
      - Unless you're really, really sure only native speakers will maintain the code, I think that using anything except English is a bad idea (consider that I'm neither a native English speaker, nor do I love that language -- I'm just being practical, unfortunately Esperanto is not widespread).
      - Different languages treat accented characters differently. For example: Portuguese considers accented letters the same letters for "letter naming" and for sorting (even the cedilla is considered a variant of "c"). Spanish (Castilian) considers certain accented letters distinct, but not others. Polish treats accented letters differently, and even clusters of letters are characters on their own (like "rz", "szcz" etc).
      - Certain completely different accents look too alike, or are simply confusing for non-speakers (like Romanian cedilla-like accents, or Slovak soft "L", which looks like an "l" plus apostrophe).
      - You can produce the same accented letter using different Unicode code sequences.
      - Certain characters from different scripts look alike, if not exactly the same: Cyrillic "Н" (the letter en, which sounds like N) and Latin "H" for example.
      - You may even understand the script, but are you able to type that? I can type French easily with my keymap, but a French keymap is useless for me, for example.
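
      Two of the points above (the same accented letter from different code sequences, and look-alike letters from different scripts) can be shown with a short Python sketch using only the standard unicodedata module. The variable names are illustrative.

```python
import unicodedata

composed = "\u00e9"      # "é" as a single code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent

# They render the same but compare unequal until normalized.
same_without_normalizing = composed == decomposed
same_after_nfc = (unicodedata.normalize("NFC", composed)
                  == unicodedata.normalize("NFC", decomposed))

# Look-alikes from different scripts are different characters entirely.
latin_h = "H"
cyrillic_en = "\u041d"
names = (unicodedata.name(latin_h), unicodedata.name(cyrillic_en))
```

      Normalization fixes the first problem but does nothing for the second: no normalization form maps the Cyrillic letter to the Latin one.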

    16. Re:Yes, Unicode is "the new black" by icebraining · · Score: 5, Insightful

      I'm Portuguese and our language uses accents, but if I ever get a source code file with accents in variable names I'll insult the person. Writing with accents in programming serves absolutely no purpose and it only causes problems. It's slower (two key presses instead of one), it's less compatible, it can be troublesome if I need to send the code to someone without accents in the keyboard, etc.

      In fact, not only do I disagree with accents in programming, but I prefer writing all the names in English. Where would OSS be if all the Gnome devs had to learn Spanish to contribute to De Icaza's code, or Finnish to contribute to Linux?

    17. Re:Yes, Unicode is "the new black" by Anonymous Coward · · Score: 0

      It is so much fun getting code from my overseas colleagues with comments that are a bunch of garbage characters.

      I can't even ask the Korean guys to translate it for me, because my IDE inevitably does ASCII-to-UTF-8 conversion on the already UTF-8 characters, leaving me with leaning 'A's, 'A's with circumflex accents, Euro symbols, and other uselessness that is surprisingly hard to convert back to Korean characters.
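
      What is described here is double encoding: UTF-8 bytes get read as a single-byte code page and then saved as UTF-8 again. Below is a minimal Python sketch of the damage and its reversal. It assumes the wrong code page was Latin-1; the Euro symbols mentioned above suggest Windows-1252 in the real case, which has a few unmapped bytes and so does not always round-trip this cleanly.

```python
original = "한글"  # the word "Hangul", used here as sample text

# The tool misreads the UTF-8 bytes as Latin-1: each byte becomes one character.
garbled = original.encode("utf-8").decode("latin-1")

# As long as nothing was lost, undoing the steps in reverse order recovers the text.
repaired = garbled.encode("latin-1").decode("utf-8")
```

      The repair only works while every garbled character is still intact. One lossy copy-and-paste through a narrower encoding and the original bytes are gone.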

    18. Re:Yes, Unicode is "the new black" by Fareq · · Score: 1

      Unless you are planning on putting a bunch of BEL chars in your code?

    19. Re:Yes, Unicode is "the new black" by shutdown+-p+now · · Score: 1

      In all places I worked at where employees didn't share native language (even if it was just the odd few), coding style required English identifiers and comments. But when the team is entirely Greek, or Russian, or Japanese, and the product they make doesn't really make sense outside of their native market, it's perfectly reasonable to use comments in one's native language.

      As for your IDE - file a bug against it. A text editor which cannot handle UTF-8 in 2010 is broken, period.

    20. Re:Yes, Unicode is "the new black" by Anonymous Coward · · Score: 0

      English is the 1st or 2nd most commonly spoken language on the planet if you include nonnative speakers.

      Rather, good luck selling your software if the source isn't in English.

    21. Re:Yes, Unicode is "the new black" by paedobear · · Score: 1

      You forget to add that languages change their sort order / consideration of ligatures being letters from time to time. German and Swiss German have both - iirc - promoted/demoted letters and changed sorting within the past 10 or 15 years

    22. Re:Yes, Unicode is "the new black" by Jesus_666 · · Score: 1

      Good point. Much of localization involves rethinking some of your assumptions. For instance, OpenOffice stubbornly insists (whether AutoCorrect is on or off) that a string like "5/7" must always denote a date even though I'm in Germany, using a German locale and we never use slashes in dates over here, much less in the format M/D/Y. That's localization and no amount of not supporting the local character set will save you from it.

      (I really like the argument that using ASCII saves you from having to test your system for multiple regions. It's true; if your software can't easily be ported to different regions you won't sell it abroad and thus won't have to test it...)

      --
      USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
    23. Re:Yes, Unicode is "the new black" by Chowderbags · · Score: 1

      Long term we're probably better off with one of the systems of natural units (Planck units are probably the best choice). SI units are based off very Earth-centric measurements (the meter being 1/10,000,000th of the distance from the North Pole to the Equator; mass and temperature based off of water; most egregiously, the candela is based off of the burn properties of a particular type of candle), so when we meet aliens, the units will be seen as just as quaint as most of the world sees US Customary Units now.

    24. Re:Yes, Unicode is "the new black" by skastrik · · Score: 1
      The practical reason for using ASCII only is interoperability between tools that deal with the source (humans with keyboards being one of the tools).

      That said, I have experimented a couple of times with national non-ASCII variable names, and I think that such programs are actually easier to read and more fun to write.
      In particular, when the problem domain is highly national it is often counter-productive to try to invent English names for variables, since these names would not be the ones used by the customer or related documentation.
      So my programs use ASCII for variable names (obviously) but often in a mix of English and my own language, the ratio somewhat depending on the problem. Even standard English actions such as get/set/create... might be followed by national words.

    25. Re:Yes, Unicode is "the new black" by Anonymous Coward · · Score: 0

      Well, I'm Spanish, and I can tell you accents are a no-no in variable names, all right; but we totally hate being unable to use "ñ" in our variables. It forces you to call your variables "ano" instead of "año". And "ano" in Spanish means "anus" in English. Some minorities of prudes use "anio", "anyo" and other strange contraptions, but the fact is that unless you are using an enlightened language like C#, all your years become anus. C++ programming in Spanish really stinks, I can assure you.
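
      For comparison, Python 3 is another language that accepts non-ASCII letters in identifiers. The tiny sketch below is only an illustration, with made-up variable names, not something the poster wrote.

```python
año = 2010  # "year"; a legal Python 3 identifier

# The ASCII fallback is a different name (and, in Spanish, a different word).
names_differ = "ano" != "año"

# str.isidentifier() reports whether a string would be accepted as a name.
is_valid_name = "año".isidentifier()
```

      Whether the rest of the toolchain (editors, diff tools, colleagues' keyboards) copes with such names is a separate question, and the one most of this thread is arguing about.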

    26. Re:Yes, Unicode is "the new black" by Anonymous Coward · · Score: 0

      I'm Portuguese and our language uses accents, but if I ever get a source code file with accents in variable names I'll insult the person. [...] In fact, not only do I disagree with accents in programming, but I prefer writing all the names in English.

      To the point, with reasons clearly expressed to back it up.

      You are my hero. :)

    27. Re:Yes, Unicode is "the new black" by Jonner · · Score: 1

      So, you're really saying that you can avoid a lot of hassle if you just stick to English and ignore all other languages. That has been the attitude of many programmers for decades, but it just isn't good enough any more.

    28. Re:Yes, Unicode is "the new black" by Anonymous Coward · · Score: 0

      Here's the thing though, unicode and ASCII are two different tools for two different jobs. Unicode is a map of alphabet and symbols to a byte code. ASCII is a tool for information interchange.

      You realize that the first 128 code points in Unicode are ASCII, right?

      Unicode is a (huge) superset of ASCII.
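
      The superset claim is easy to check. Here is a minimal Python sketch, with illustrative variable names, showing that every ASCII character encodes to the same single byte in UTF-8, control characters such as BEL included, while anything beyond ASCII takes more than one byte.

```python
# Every one of the 128 ASCII code points is the same single byte in UTF-8.
ascii_matches_utf8 = all(
    chr(cp).encode("ascii") == chr(cp).encode("utf-8") == bytes([cp])
    for cp in range(128)
)

# The control characters are part of that range too, e.g. BEL.
bel_code_point = ord("\a")

# Outside ASCII, UTF-8 needs multi-byte sequences.
e_acute_utf8 = "\u00e9".encode("utf-8")
```

      This is why a pure-ASCII source file is already a valid UTF-8 file, byte for byte.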

  7. We've tried this before by FeatherBoa · · Score: 4, Informative

    Everyone who tried to do something useful in APL, put up your hand.

    1. Re:We've tried this before by MichaelSmith · · Score: 1

      Everyone who tried to do something useful in APL, put up your hand.

      I never had access to the right keyboard.

    2. Re:We've tried this before by Anonymous Coward · · Score: 0

      I have, but it is inevitably a waste of time. I get the feeling this guy is clueless about the reality of Unicode. ASCII was invented out of necessity based on the lessons that Unicode would have taught us in the long run, basically solving the problems before they were actual problems. This is simply amazing. Now stop wasting my time and get off my lawn.

    3. Re:We've tried this before by SimonInOz · · Score: 4, Interesting

      Incredibly, I worked for a major investment company who had, indeed, done something useful in APL. In fact they had written their entire set of analysis routines in it, and deeply interwoven it with SQL. I had to untangle it all. (Would you believe they had 6-page SQL stored procedures? No, nor did I - but they did).
      APL is great sometimes - especially if you happen to be a maths whizz and good at weird scripts. Not exactly easy to debug, though. Sort of a write-only language.

      For the last ten plus years, we have been steadily moving in the direction of more human readable data - the move to XML was supposed to be a huge improvement. It meant you could - sort of - read what was going on at every level. It also meant we had a common interchange between multiple platforms.

      So you want to chuck all that away to get better symbols for programming? No, I don't think so.
      I must point out that the entire canon of English Literature is written in - surprise - English, and that's definitely ASCII text. I don't think it has suffered due to lack of expressive capability.

      What does surprise me, though, is how fundamentally weak our editors are. Programs, to me, are a collection of parts - objects, methods, etc, all with internal structure. We seem very poor at further abstracting that - why, oh tell me why, when I write a simple - trivial - bit of Java code, do I need to write functions for getters and setters all over the place - dammit, just declare them as gettable and settable - or (to keep full source code compatibility) the editor could do it. Simply, easily, transparently. And why can't the editor hide everything except what I am concerned with?
      Microsoft does a better job of this in C#, but we could go much, much further. We seem stuck in the third generation language paradigm.

      --
      "Cats like plain crisps"
    4. Re:We've tried this before by Anonymous Coward · · Score: 0

      Hand up! I wrote lots of useful code in APL. Sometimes I had an APL keyboard (and the requisite APL typeball on the 2741!) and sometimes I had to use the ASCII mnemonics that the system provided for the people stuck with ASCII symbols. And that was in 1972.

    5. Re:We've tried this before by PolygamousRanchKid+ · · Score: 1

      In the 80's, my sister, a chemical engineer, wrote control systems for oil refineries in APL. That scared the hell out of me, and I was happy that I live on a different continent. On the other hand, back in the 80's when I was studying at the university, APL was very popular with math students. I guess they were used to dealing with all those crazy symbols. The biggest problem was that the brilliant, but absent minded types would work for an hour, and quit without saving the workspace.

      On another note, a prof told me that part of the reason IBM used that symbol set was that it was trying to promote its "goofball" typewriter systems. For youse youngins', the typewriter had a golf ball sized "head" with all the symbols on it. It would rotate to the appropriate symbol and whack that onto the paper. And the "goofball" had a simple clip that allowed you to easily and quickly change the "goofball." Hey, presto! New symbol set!

      --
      Schroedinger's Brexit: The UK is both in and out of the EU at the same time!
    6. Re:We've tried this before by lennier · · Score: 1

      What does surprise me, though, is how fundamentally weak our editors are. Programs, to me, are a collection of parts - objects, methods, etc, all with internal structure. We seem very poor at further abstracting that

      Yes. But if the programming languages we use don't allow abstraction of repeated boilerplate, it's not the editor's fault, it's the language's.

      We have chosen to put up with and accept the faults of extremely weak languages, like C++ and Java, instead of expressive ones like Lisp. Why? I have no idea. Because we like pain, and unproductive tedium, I think. That's why I lost interest in mainstream programming years ago - language designers were apparently not interested in automating the process of writing programs, and instead decided to dump a whole lot of repetitive nonsense onto the programmers - and then when that became an obvious burden, onto the IDE and editor. Instead of fixing the languages to allow clear expression of ideas without repetition, they just layered more brokenness on top of shoddy foundations.

      Wake me up when language designers decide to do their jobs.

      --
      You are not a brain: http://books.google.com/books?id=2oV61CeDx-YC
    7. Re:We've tried this before by AnonymousClown · · Score: 1
      Lisp? That's still typing.

      I just can't get over the fact that we're still typing to program computers - that's soooooo twentieth century. Or even when we do have visual tools, it still generates code to be compiled.

      What really needs to be done is 100% visual directly to machine - no stops.

      --
      RIP America

      July 4, 1776 - September 11, 2001

    8. Re:We've tried this before by 644bd346996 · · Score: 2, Insightful

      Try reading an EULA and then come back and tell me that English is sufficiently expressive as-is.

    9. Re:We've tried this before by nanospook · · Score: 1

      Oh OH!! ME! I wrote at least a few programs back in comp sci.. I thought it was cool..

      --
      Have you fscked your local propeller head today?
    10. Re:We've tried this before by icebraining · · Score: 1

      why, oh tell me why, when I write a simple - trivial - bit of Java code, do I need to write functions for getters and setters all over the place - dammit, just declare them as gettable and settable - or (to keep full source code compatibility) the editor could do it.

      For that particular problem, you can try Project Lombok.

    11. Re:We've tried this before by Anonymous Coward · · Score: 0

      Our editors are strong, our compilers are weak. I don't think an editor that constantly hints, hides or changes your input is good, it is rather disturbing to our thoughts. A good editor should stay at "print hello world" except maybe some highlighting. However, we need intelligent compilers that can parse that "one source code" at our thought level into optimized C code or perl code or whatever programming language is needed. We need compilers that can translate from pseudo code to real code.

    12. Re:We've tried this before by GigsVT · · Score: 1

      why, oh tell me why, when I write a simple - trivial - bit of Java code, do I need to write functions for getters and setters all over the place - dammit, just declare them as gettable and settable - or (to keep full source code compatibility) the editor could do it. Simply, easily, transparently. And why can't the editor hide everything except what I am concerned with?

      It's not the editor's fault you are using a shit language.

      --
      I've had enough abrasive sigs. Kittens are cute and fuzzy.
    13. Re:We've tried this before by bh_doc · · Score: 1

      Oh come now, Legalese is to English as Pig Latin is to Latin.

    14. Re:We've tried this before by bastia · · Score: 1

      What editor are you using that doesn't generate your getters and setters for you? In Eclipse, for example, that would be Source > Generate Getters and Setters...

    15. Re:We've tried this before by Fareq · · Score: 1

      You know, there is actually a good reason to somewhat restrain expressiveness in programming languages.

      I love writing in a nice dynamic language too, because it inevitably means less typing, more doing.

      BUT, when I have to read someone else's code, the fact that the types of things have to be explicitly spelled out helps SO much.

      This is actually something I've seen as C# evolves. The new 'var' keyword was really originally intended to be used in specific situations where spelling out the full type is tedious and the type is relatively unimportant. Same reason for the new 'auto' in C++. You don't really want to type HashMap.Iterator iter at the beginning of a for loop... it's fairly obvious what you intended, and that's a hideous typename.

      But now that the keyword is there, it gets used for everything, and it becomes rather non-obvious when reading someone else's code what all of the types are. It becomes mandatory to have an IDE that will give you type info when you mouseover something, and you have to mouseover everything... because the types are all gone.

      It does save keystrokes, and makes it easier to type, but I really prefer having all of the typenames explicitly listed when I need to read something.

    16. Re:We've tried this before by SimonInOz · · Score: 1

      Of course Eclipse WILL generate getters etc, but I have to ask. Also they get in the way all the time. We can do far better.

      --
      "Cats like plain crisps"
    17. Re:We've tried this before by SimonInOz · · Score: 1

      >> It's not the editor's fault you are using a shit language.

      Hmm. The single most popular computer language on earth. Yup. Clearly shit.
      A bit like criticising English for the same reason. Get over yourself - of course there are new languages with more succinct syntax - but really, can we not progress what we have? It's clearly possible to have editors aid considerably in the development of programs. Handling getters and setters would be but one tiny step.
      How about using colour coding better - and holding a decent display of the current object. Allowing that model to be editable? Allowing some halfway decent object relationship modelling. The list is long. (Maybe UML could finally deliver on its promise - but I confess I doubt it).

      Saying "it's the language, stupid" is a waste of time. Yes, there are briefer languages - APL was unquestionably one of them. But it hasn't been very popular, has it? Indeed, the most popular language of all time, which was surely COBOL, wasn't exactly brief, now was it?
      So instead of trying to change the language (Esperanto, anyone?), let's make the tools better.

      --
      "Cats like plain crisps"
    18. Re:We've tried this before by cgenman · · Score: 2, Insightful

      Things I would love to see standard in all new editors:

      1. Little triangles that hide blocks of code unless you explicitly open and investigate them.
      2. Dynamic error detection. Give me a little underline when I write out a variable that hasn't been defined yet. Give a soft red background to lines of code that wouldn't compile. That sort of thing.
      3. While we're at it, "warning" colors. When "=" is used in a conditional, for example, that's an unusual situation that should be underlined in Yellow.
      4. Hard auto-indent. It may be two spaces in the source code, but accidentally copying the indentation, and putting it in the wrong places, etc, should just be taken care of. That shouldn't even be an issue any more.
      5. Code-hint hover. When you hover over a function name, bring up a window with the first few lines of that function. Maybe open it in a "related code" pane?
      6. Right-click to jump to anything. Right-click a variable to jump to the declaration, or goto other places it is used. Right-click a class name to bring up that class definition.
      7. Start typing out a function, and get a menu of variable-specific functions that can be called. Flash actually does this surprisingly well, or did before CS5.

    19. Re:We've tried this before by Anonymous Coward · · Score: 0

      For the last ten plus years, we have been steadily moving in the direction of more human readable data - the move to XML was supposed to be a huge improvement. It meant you could - sort of - read what was going on at every level.

      Who is "we"? Not everybody uses US-ASCII as their native character set.

      Nor do mathematicians. Isn't this beautiful code? With the right font it's not ambiguous at all. (With the wrong font there's no distinction between I and l either.) Sorry, can't quote because Slashcode is stuck in the 1980s.

      why, oh tell me why, when I write a simple - trivial - bit of Java code, do I need to write functions for getters and setters all over the place

      Because it's Java. Try Ruby.

    20. Re:We've tried this before by Anonymous Coward · · Score: 0

      why, when I write a simple - trivial - bit of Java code, do I need to write functions for getters and setters all over the place - dammit, just declare them as gettable and settable - or (to keep full source code compatibility) the editor could do it.

      You mean like
      right click -> source -> generate getters/setters
      in Eclipse?

    21. Re:We've tried this before by tancque · · Score: 1

      APL was the main language in our introduction course "bio-informatics" at university (early nineties).
      I vaguely remember a keyboard overlay, lots of matrix calculations and lots of complaining about why we didn't do a "normal" language like Pascal.
      Never used it since...

      --
      Smoke me a kipper, I'll be back for breakfast!
    22. Re:We've tried this before by Yetihehe · · Score: 2, Interesting

      1. Little triangles that hide blocks of code unless you explicitly open and investigate them.

      Netbeans. (view > code folds > collapse all)

      2. Dynamic error detection. Give me a little underline when I write out a variable that hasn't been defined yet. Give a soft red background to lines of code that wouldn't compile. That sort of thing.

      Netbeans.

      3. While we're at it, "warning" colors. When "=" is used in a conditional, for example, that's an unusual situation that should be underlined in Yellow.

      Netbeans, but not background, it gives you little yellow icons on left side of code and yellow lines near scrollbar (to track errors in whole document).

      4. Hard auto-indent. It may be two spaces in the source code, but accidentally copying the indentation, and putting it in the wrong places, etc, should just be taken care of. That shouldn't even be an issue any more.

      Netbeans. (ctrl+shift+v - paste formatted).

      5. Code-hint hover. When you hover over a function name, bring up a window with the first few lines of that function. Maybe open it in a "related code" pane?

      Netbeans. If you use comments before functions, it will show those.

      6. Right-click to jump to anything. Right-click a variable to jump to the declaration, or goto other places it is used. Right-click a class name to bring up that class definition.

      Netbeans. But with ctrl+click.

      7. Start typing out a function, and get a menu of variable-specific functions that can be called. Flash actually does this surprisingly well, or did before CS5.

      Netbeans. Also, Flash did this surprisingly badly compared to Netbeans.

      Another nice feature: ctrl+shift+arrow down - copies current line or selection and inserts it lower (+arrow up - inserts it above). It's a surprisingly good idea, one I miss in many other editors.

      --
      Extreme Programming - Redundant Array of Inexpensive Developers
    23. Re:We've tried this before by Carewolf · · Score: 1

      Not a problem, most modern editors have these functions. Kate/kwrite/kdevelop (same editor in different shells) has all of those, and many other editors do as well.

    24. Re:We've tried this before by vlm · · Score: 1

      The problem with non-verbal languages is most human beings are only capable of verbal collaboration using verbal-compatible languages.

      Trying to discuss a non-trivial visual program at the water cooler or over the cube wall would "literally" (bad pun, sorry) be impossible. Just imagine how bad it would sound. Bug fixing would be extremely slow.

      Finally, you can't grep pictures. No typing / and searching for that interesting function API. It would truly become the first intentionally write-only language.

      --
      "Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
    25. Re:We've tried this before by DavidHumus · · Score: 0

      Some of us not only tried, but succeeded. Among the APL systems on which I've worked, one was used by five traders who accounted for 1% of the volume of the NYSE and made a lot of money for the firm. Another was an engineering design system that far surpassed anything that was commercially available for years after it was created in APL.

      Anyone who doesn't know what he's talking about, slam APL - which had features in the 1960s that are still in advance of contemporary languages. I'm sorry that your little brain can't deal with it. It remains a tremendously powerful tool - there are still four or five commercial vendors who sell a version of the language. It's a pity that so many programmers still act as if they are paid by the hour and choose large cumbersome tools when there are so many elegant and powerful ones available - not just APL, but this is the most frequently maligned by ignoramuses.

    26. Re:We've tried this before by reiisi · · Score: 1

      Playing the devil's advocate here, but are you sure that English can be written strictly in ASCII?

      --
      Computer memory is just fancy paper, CPUs just fancy pens with fancy erasers; the 'net is just a fancy backyard fence.
    27. Re:We've tried this before by Viol8 · · Score: 1

      Speaking as someone who's quite happy using vi to do all my coding at home and work I have to say that if the editor is the bottleneck between you and getting a program out perhaps you should consider another career? The hard part should be the coding logic and/or maths, not getting the code text into the machine.

      "why, oh tell me why, when I write a simple - trivial - bit of Java code, do I need to write functions for getters and setters all over the place"

      You don't. Just get or set the variable directly.

    28. Re:We've tried this before by Dog-Cow · · Score: 1

      The problem with Java is that the language itself has no concept of properties. Perforce, the IDE must handle getters and setters for you.

      Both Object Pascal (the language commonly known as Delphi) and Objective-C have a separate concept of properties. This allows an abstraction between the class's data model and its public interface. It also allows the compiler to auto-generate missing getters and setters (where you only write methods to handle special cases, such as data conversions or initiating side effects).
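
      As a rough illustration of the property idea described above, here is a minimal sketch in Python rather than Delphi or Objective-C. The Account class and its field names are hypothetical; the point is only that callers use what looks like a plain field while the class keeps a separate internal representation.

```python
class Account:
    """Toy class: callers see a plain attribute, the class keeps control."""

    def __init__(self, balance_cents=0):
        self._balance_cents = balance_cents  # internal data model (hypothetical)

    @property
    def balance(self):
        # Getter: converts the stored integer cents to a decimal amount.
        return self._balance_cents / 100

    @balance.setter
    def balance(self, amount):
        # Setter: the special-case logic lives here, not at every call site.
        if amount < 0:
            raise ValueError("balance cannot be negative")
        self._balance_cents = round(amount * 100)


acct = Account()
acct.balance = 12.5      # looks like direct field access
print(acct.balance)      # goes through the getter
```

      Only the special cases (the conversion and the range check) need hand-written code; a class with no such cases could expose a plain attribute and add the property later without changing callers.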

    29. Re:We've tried this before by moonbender · · Score: 1

      Every modern Java IDE will be happy to create the trivial getters/setters for you with zero effort. Or just declare it public, if it really is trivial code. Tiny refactorings have really changed the way I write code. For instance, I rarely declare fields and even local variables manually anymore, it's just easier to just start writing and introduce it later. "Semantic" selection is also nice, using shift+alt+up/down to select increasingly larger (/smaller) expressions.

      Eclipse Mylyn is supposed to adjust the IDE views to hide away complexity that's not relevant to the current task (e.g. fold away functions until you view/modify them the first time). Not sure how well it works in practice.

      --
      Switch back to Slashdot's D1 system.
    30. Re:We've tried this before by Anonymous Coward · · Score: 0

      I must point out that the entire canon of English Literature is written in - surprise - English, and that's definitely ascii text.

      Ah, no. ASCII doesn't have a thorn as such, for example. And also note that as originally envisioned ASCII _did_ support accents, through the paradigm. That in fact was a conscious design decision (^ lost its stem as a result) and worked fine on paper terminals. Not so much on dumb glass terminals with character generators driven by limited ROM space, and later models didn't really pick up on that feature when they could.

      ASCII in its current usage is a very American thing and doesn't really do even for British English. Certainly not for the stuff older than maybe a century or two. It also doesn't suffice for things like only-show-this-hyphen-at-a-break, non-breaking-spaces, that sort of thing. It offers precious little to facilitate typesetting.

      There's a lot that could stand improvement, like how most control characters are frankly so much dead meat. Even the simple end-of-line marker isn't that simple, there's two and consequently three major combinations to use them. This has been apparent for a long time, but the answer was to expand the code space and cram every conceivable character and accent and even arbitrary combinations of both into the resulting encoding system. We call it unicode, but it's better called a complete crock and a nightmare.

      Anyway. For the time being, ASCII will do just fine for programming, mostly by virtue of historical accident. Expansion seems nice but isn't, not really. And unicode certainly isn't the answer. But it's the best we have in wide deployment, so we'll continue to use it.

    31. Re:We've tried this before by Anonymous Coward · · Score: 0

      I must point out that the entire canon of English Literature is written in - surprise - English, and that's definitely ascii text.

      No, it isn't ASCII; there are a lot of glyphs missing from ASCII that were/are used in English Literature. Perhaps long-s is the most famous one.

      I don't think it has suffered due to lack of expressive capability.

      Obliviously proclaimed by someone who, obviously, only reads English. English isn't an expressive language, it is a very inexpressive language. The main reason may not be its alphabet, but please don't claim that English is an expressive language.

    32. Re:We've tried this before by ztransform · · Score: 1

      Would you believe they had 6 page SQL stored procedures?

      I keep coming across multi-page SQL stored procedures too. Horrific, buggy, impossible to maintain. Who are the ???holes that keep getting employment contracts to do this kind of evil? They really need a kick up the backside. And funny how most of them end up in the finance industry...

    33. Re:We've tried this before by Late+Adopter · · Score: 1

      Try Eclipse.

    34. Re:We've tried this before by Jesus_666 · · Score: 1
      Some of your ideas are nice but I do see a few problems with some.

      2. Dynamic error detection. Give me a little underline when I write out a variable that hasn't been defined yet. Give a soft red background to lines of code that wouldn't compile. That sort of thing.

      Of course this would mean that the IDE would be rather busy constantly inspecting your code as you type. This would most likely make the IDE less responsive unless you're on a beefy machine with really fast I/O.

      6. Right-click to jump to anything. Right-click a variable to jump to the declaration, or goto other places it is used. Right-click a class name to bring up that class definition.

      No. I happen to like the idea of a contextual menu and having the right mouse button open one is an extremely widespread convention. You could do something like alt+right click, though.

      --
      USE HOT GRITS WITH STATUE OF NATALIE PORTMAN (NAKED AND PETRIFIED)
    35. Re:We've tried this before by eyrieowl · · Score: 1

      try intellij. it's far and away the best java editor. and you can configure what is expanded and collapsed when looking at a file, achieving your goal of hiding everything you're not looking at. All the major ides make inserting setters and getters fairly easy i believe. For IntelliJ, at least, ctrl-i gives you the ability to implement methods, and alt-insert allows you to insert getters/setters/equals/hash/tostring. I think...that the "issues" being discussed are VERY much an issue of editor design at least as much as language design. I think that lots of the "boilerplate" has value, or can have value in some situations. The key is having editors which are in tune with the language and which know how to make the programmer's job easier. I don't believe language design should be handicapped by what best facilitates development in vi. It should be editable in vi or other text editors, but I have no problem with a language designer taking more advanced editing facilities into account when designing their language.

    36. Re:We've tried this before by GigsVT · · Score: 1

      Esperanto is an apt comparison.

      Java was designed to be "pure" and engineered from the ground up to fit some idealistic vision, rather than being a good language, kind of like Esperanto.

      So now you have a semi-interpreted language where hello, world takes 150 megs of RAM, and the promise of being "write once-run anywhere" still hasn't been delivered on.

      --
      I've had enough abrasive sigs. Kittens are cute and fuzzy.
    37. Re:We've tried this before by Anonymous Coward · · Score: 0

      PowerHouse would have been your best friend, but Cognos got bought by IBM, and they're letting the project slide. Damned shame - it was a mighty fast development environment, we wrote some good shit in PowerHouse.

      http://en.wikipedia.org/wiki/Powerhouse_(programming_language)

    38. Re:We've tried this before by DutchUncle · · Score: 1

      Sort of a write-only language.

      I saw Kenneth Iverson speak at RPI in the mid-'70s. He advocated short routines that were disposable or replaceable - write-once code. After all, with the massive power of some of the APL functions, you could do as much in a line as multiple pages of FORTRAN. Whether anyone would ever be able to *understand* it, though . . . .

    39. Re:We've tried this before by badkarmadayaccount · · Score: 1

      How about both?

      --
      I know tobacco is bad for you, so I smoke weed with crack.
    40. Re:We've tried this before by rsdavis9 · · Score: 1

      My second programming language (after BASIC) back in 1968.
      I have an autographed copy of "A Programming Language" by Ken Iverson.

    41. Re:We've tried this before by SimonInOz · · Score: 1

      Employment insurance?

      --
      "Cats like plain crisps"
    42. Re:We've tried this before by SimonInOz · · Score: 1

      Clearly you have never worked with anybody else's code - had you dealt with the pages and pages of rubbish churned out by underskilled, underpaid workers (not programmers), you might start to agree with me ...

      "Set the variable directly" ... which part of object oriented programming classes did you miss?

      Damn I'm grumpy this morning. Need caffeine.

      --
      "Cats like plain crisps"
    43. Re:We've tried this before by Anonymous Coward · · Score: 0

      Doing something useful in APL isn't the trick. Figuring out what the hell you just did is.

    44. Re:We've tried this before by Anonymous Coward · · Score: 0

      Programs, to me, are a collection of parts - objects, methods, etc, all with internal structure. We seem very poor at further abstracting that - why, oh tell me why, when I write a simple - trivial - bit of Java code, do I need to write funtions for getters and setters all over the place - dammit, just declare them as gettable and settable - or (to keep full source code compatibility) the editor could do it. Simply ,easily, tranparently. And why can't the editor hide everything except what I am concerned with?

      I actually do have what you seek... But I am writing Fortran 2000 with VIm and folding enabled....

  8. They tried this once. by FooAtWFU · · Score: 0, Redundant

    It was called APL. It never really caught on all that well.

    --
    The World Wide Web is dying. Soon, we shall have only the Internet.
  9. If you can't express yourself in ASCII... by MaggieL · · Score: 4, Funny

    ...the character set isn't the problem.

    And I say this as an old APL coder.

    (There aren't many new APL coders.)

    --
    -=Maggie Leber=-
    1. Re:If you can't express yourself in ASCII... by Anonymous Coward · · Score: 1, Insightful

      Iverson thought the character set was a problem. That is why, at the end of his career, he invented 'J' to put APL on a standard keyboard and get past the many issues the custom glyphset creates.

    2. Re:If you can't express yourself in ASCII... by Anonymous Coward · · Score: 0

      APL -offspring of marriage of a scientific calculator and A Programming Language.

      I liked APL, concise, incomprehensible, jargony, self contained universe. The perfect hacker language.

    3. Re:If you can't express yourself in ASCII... by Bigjeff5 · · Score: 1

      And because of that J is now the most popular programming language in the world!

      Or at least one of them, right? No?

      Wait, you mean it didn't catch on like wildfire in the almost 20 years since its inception?

      Huh, wonder why.

      --
      Security is mostly a superstition... Avoiding danger is no safer in the long run than outright exposure. - Helen Keller
  10. Already proposed... on C++ by kikito · · Score: 1

    And more than 10 years ago, in Bjarne Stroustrup's "Generalizing Overloading for C++2000". PDF can be downloaded here:

    www2.research.att.com/~bs/whitespace98.pdf

    Pages 4-5 deal with this.

    It was also a joke paper. Like I hope this article is.

  11. Examples? by Waccoon · · Score: 1

    So, what are his ideas?

    1. Re:Examples? by 0123456 · · Score: 3, Funny

      So, what are his ideas?

      EBCDIC?

    2. Re:Examples? by izomiac · · Score: 2, Interesting

      From TFA apparently he wants to be able to use Ω (Omega) to name a variable, and ÷ (Division Sign) as an operator. My interpretation of his opinion is that a descriptive name for a variable is inferior to using greek letters, and that using mathematical operators that take an extra five or so keystrokes are superior to the standard +-*/^ set that people have become accustomed to.

      IMHO, if you use more than 26 single letter variables something is seriously wrong, and trying to make mathematical formulas pretty in code isn't practical without a whole lot of unneeded complexity. Sure, having an eight line formula with fractions within fractions and tiny exponent numbers might be (slightly) better than five layers of parentheses, but you aren't going to get that with just unicode (AFAIK), and the pain of dealing with a slightly misplaced term confounding the unicode to math converter isn't one I'd like to experience. Unicode or even LaTeX code for comments might be useful though.
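
      To make the variable-versus-operator distinction concrete, here is a small sketch in Python 3, which is one language (not the one TFA discusses) that accepts non-ASCII letters in identifiers but keeps its operators in ASCII. The 50 Hz figure and the helper name is_valid_source are made up for the example.

```python
import math

# A Greek letter is a legal identifier in Python 3.
Ω = 2 * math.pi * 50.0   # angular frequency for a hypothetical 50 Hz signal

def is_valid_source(snippet):
    """Return True if Python's own parser accepts the snippet as an expression."""
    try:
        compile(snippet, "<example>", "eval")
        return True
    except SyntaxError:
        return False

print(is_valid_source("Ω / 2"))   # Greek identifier with an ASCII operator
print(is_valid_source("Ω ÷ 2"))   # the division sign is not a Python operator
```

      The first check passes and the second is rejected by the parser, so in this language the Unicode freedom stops at names.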

    3. Re:Examples? by Drishmung · · Score: 1
      ITA2 SHOULD SUFFICE FOR ANYONE STOP

      (Sigh, and why am I not surprised that the /. filter declined to accept that line on its own?)

      --
      Protoplasm. Quiet Protoplasm. I like quiet protoplasm.
    4. Re:Examples? by N3Roaster · · Score: 1

      I only need two keystrokes to get a division sign (option/) so I'm not really sure what you're going on about there. That said, I don't think it matters so much in terms of writing the code. As others have pointed out, we've been down that road with APL. Where this sort of idea really shines, however, is in reading the code. One of the comments on the article touched on this, so I'll quote the relevant bit:

      Robert Melton | Mon, 01 Nov 2010 02:11:13 UTC
      What an odd combination of criticisms... first and foremost, I think you already hit on the correct solution... custom syntax and creation mechanisms are best explored as layers on top of existing tools, not a fundamental part of a new tool. I believe to try to integrate your ideas would have crippled Go, and given it nearly no advantages, at the cost of a huge degree of developer mind share. Bootstrapping a language is hard enough without giving yourself new disadvantages. I have never seen Guido van Rossum claim anything other than "readability" and that it was the natural flow as foreseen by Knuth (1974)

      The problem here, however, is a cultural one. I suspect that most of the people who write software have never read through a non-trivial program and come out of it with an understanding of the program (contrast this with novelists reading novels) and most software is written in such a way that reading the code for understanding is more like assembling a puzzle from diverse bits scattered all about.

      We already have pretty much all of these presentation niceties with CWEB. I frequently write my programs in literate C++ and can use goofy characters with subscripts if I want. TeX markup in comments is very nice. Especially useful is having something a bit more visually distinctive separating assignment from equality testing, but the big gain here is that it makes it easier to write programs as code narratives which humans can read to gain an understanding of the program. Once you have someone reading the program, how the code is represented starts to matter. Granted, this is another example from the pile of ideas that never really caught on.

      --
      Remember RFC 873!
    5. Re:Examples? by Chrisq · · Score: 1

      From TFA apparently he wants to be able to use Ω (Omega) to name a variable, and ÷ (Division Sign) as an operator. My interpretation of his opinion is that a descriptive name for a variable is inferior to using greek letters, and that using mathematical operators that take an extra five or so keystrokes are superior to the standard +-*/^ set that people have become accustomed to.

      Great, so code reviewers will have to worry whether

      destaccount += credit

      is an ASCII local variable or a global with the Cyrillic Ye character that puts money in some other account.
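
      The look-alike problem can be shown in a few lines. This is only a sketch in Python: the helper non_ascii_report is a hypothetical name, and the spoofed identifier swaps the Latin "e" for U+0435, the Cyrillic letter that renders almost identically in most fonts.

```python
import unicodedata

def non_ascii_report(identifier):
    """List (position, character, Unicode name) for each non-ASCII character."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(identifier)
        if ord(ch) > 127
    ]

latin = "destaccount"
spoofed = "d\u0435staccount"   # U+0435 CYRILLIC SMALL LETTER IE in place of "e"

print(latin == spoofed)            # the two names are different strings
print(non_ascii_report(latin))     # nothing to report for the pure-ASCII name
print(non_ascii_report(spoofed))   # flags the Cyrillic letter at index 1
```

      A check like this could run in a review tool or commit hook, but it only flags non-ASCII characters; deciding which ones are legitimate is still up to the reviewer.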

    6. Re:Examples? by Haeleth · · Score: 1

      Hey, you left out the figure shift and the letter shift. And you should really have rung the bell at the end.

      I guess Slashdot doesn't support those characters ...

    7. Re:Examples? by DutchUncle · · Score: 1

      Mock not. Having learned all of my software concepts on IBM mainframes (mainly because that's all there was), I started using a PDP-10 and realized (for example) how much simpler and cleverer it was to have the codes for the letters adjacent and in order. Simplifying the encoding simplifies everything.
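
      The adjacency point is easy to check. The sketch below uses Python's cp037 codec as a stand-in for EBCDIC (one common code page among several, chosen here as an assumption) and compares it with ASCII; is_contiguous is a hypothetical helper.

```python
import string

def is_contiguous(encoded):
    """True if every byte is exactly one more than the byte before it."""
    return all(b - a == 1 for a, b in zip(encoded, encoded[1:]))

letters = string.ascii_lowercase
ascii_bytes = letters.encode("ascii")
ebcdic_bytes = letters.encode("cp037")   # cp037: one common EBCDIC code page

print(is_contiguous(ascii_bytes))             # a..z are adjacent in ASCII
print(is_contiguous(ebcdic_bytes))            # EBCDIC has gaps after i and r
print([hex(b) for b in ebcdic_bytes[8:10]])   # codes for "i" and "j"
```

      With adjacent codes, tests such as "is this a lowercase letter" reduce to a single range comparison; with gaps, the same test needs three ranges or a lookup table.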

  12. It all winds up as binary anyway. by foodnugget · · Score: 4, Funny

    How silly of us to be compiling to binary all this time!
    We've been relegating ourselves to only two different options for decades!

    I reckon that a memory cell and single bit of a processor opcode should have --at least-- 7000 different possibilities. Think of everything a computer could accomplish *then*!

    Seriously, someone tell this guy you're allowed to use more than one character to represent a concept or action, and that these groups of characters represent things rather well.

    1. Re:It all winds up as binary anyway. by hedwards · · Score: 1

      It does, however the comments aren't. I'm not sure how useful this is since you still need to use ASCII characters for programming.

    2. Re:It all winds up as binary anyway. by vlm · · Score: 1

      It does, however the comments aren't. I'm not sure how useful this is since you still need to use ASCII characters for programming.

      The article, and many of the comments, strike me as coming from guys who have never used a serious hard core preprocessor.

      How bout Lingua::Romana::Perligata?

      http://www.csse.monash.edu.au/~damian/papers/HTML/Perligata.html

      This paper describes a Perl module -- Lingua::Romana::Perligata -- that makes it possible to write Perl programs in Latin. A plausible rationale for wanting to do such a thing is provided

      ... and no, the rationale is not deploying Perl programs in "Latin America".

      I believe I read about this in Perl Journal sometime last century.

      You end up with stuff like this

      sic
                              loco ianitori.
                              dato fonti perlegementum da.
      cis

      There is no particular reason why you couldn't implement the same idea in Kanji instead of Latin.

      --
      "Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
    3. Re:It all winds up as binary anyway. by Anonymous Coward · · Score: 0

      Actually you are not far off, as computers will eventually all be analog, not digital. Why? Because of the limitations of binary.

    4. Re:It all winds up as binary anyway. by qc_dk · · Score: 1

      Seriously, someone tell this guy you're allowed to use more than one character to represent a concept or action, and that these groups of characters represent things rather well.

      aaaaaaaaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaaaaaa a aaaaaaaaaaaaaaaaaa aaaaa aaaaaaaaaaaaaaaaaaa aaaaaaaaaaaaaaa ... You are so ... right! It's much more efficient. We don't even need more than a single letter.

      It's super that ASCII was made specifically for English and the number of letters therefore appears to be the ideal choice, exactly because you use it to express English. That is not the case for all languages, where the limitation in letters seems rather artificial.

  13. It ain't broke! by webbiedave · · Score: 5, Insightful

    Let's take our precious time on this planet to fix what's broken, not break what has clearly worked.

  14. vim-cute-python syntax config by Anonymous Coward · · Score: 0

    On a serious note, this article reminded me of this project I saw the other day: http://github.com/ehamberg/vim-cute-python. It makes vim show various Unicode characters for Python keywords, such as "alpha" and "not".

    Kinda neat :)

  15. Not only no, by Anonymous Coward · · Score: 4, Funny

    but fuck no.
    I eagerly await comments saying how anglo-centric, racist, bigoted, culturally-imperialist the insistence of using ASCII is.
    The nuanced indignation is salve for my frantic masturbation.
    (If my post is the only one that mentions this, all the better)

    1. Re:Not only no, by sznupi · · Score: 2, Insightful

      Also: Slashdot would never, ever, ever be able to display code snippets of such thing.

      --
      One that hath name thou can not otter
    2. Re:Not only no, by Anonymous Coward · · Score: 0

      but fuck no.
      I eagerly await comments saying how anglo-centric, racist, bigoted, culturally-imperialist the insistence of using ASCII is.
      The nuanced indignation is salve for my frantic masturbation.
      (If my post is the only one that mentions this, all the better)

      ASCII = American Standard Code for Information Interchange

      Would it be anglo-centric, we should be using anglo-saxon runes

      (now, go on wanking).

    3. Re:Not only no, by PitaBred · · Score: 1

      Slashdot is a site for those of the engineering mindset. For comments like that, you'd have to be looking for one of the sites down the hall, stupidlypoliticallycorrectdot or technicallyincompetentdot.

  16. limiting? by Tei · · Score: 2, Insightful

    the chinese have problems to learn his own language, because have all that signs, it make it unncesary complex.

    26 letter lets you write anything, you dont need more letters, really. ask any novelist.

    also, programming languages are something international, and not all keyboards have all keys, even keys like { or } are not on all keyboards, so tryiing to use funny characters like ñ would make programming for some people really hard.

    all in all, this is not a very smart idea , imho

    --

    -Woof woof woof!

    1. Re:limiting? by cblguy2 · · Score: 1

      But sometimes you do need a key to make a Capital letter. ;)

    2. Re:limiting? by Sycraft-fu · · Score: 3, Interesting

      For that matter, we could probably even get away with less letters. Some of them are redundant when you get down to it. What you need are enough letters that you can easily denote all the different sounds that are valid in a language. You don't have to have a dedicated letter for all of them either, it can be through combination (for example the oo in soothe) or through context sensitivity (such as the o in some in context with the e on the end). We could probably knock off a few characters if we tried. If that is worth it or not I don't know but we sure as hell shouldn't be looking at adding MORE.

      Also in terms of programming a big problem is that of ambiguity. Compilers can't handle it, their syntax and grammar is rigidly defined, as it must be. That's the reason we have programming languages rather than simply programming in a natural language: Natural language is too imprecise, a computer cannot parse it. We need a more rigidly defined language.

      Well as applied to unicode programming that means that languages are going to get way more complex if you want to provide an "English" version of C and then a "Chinese" version and a "French" version and so on where the commands, and possibly the grammar, differ slightly. It would get complex probably to the point of impossibility if you then want them to be able to be blended, where you could use different ones in the same function, or maybe on the same line.

    3. Re:limiting? by hedwards · · Score: 1

      That's what I'm wondering about, are there any languages which use unicode for any of the actual language stuff? Because without languages needing the extra unicode characters for actual programming this stuff doesn't appear to really make a difference beyond comments.

    4. Re:limiting? by Anonymous Coward · · Score: 0

      One comes to mind (well, I had to look it up, I'd forgotten the name): http://en.wikipedia.org/wiki/APL_(programming_language)

      Of course, that's probably a major contributing factor to why it died.

    5. Re:limiting? by Jeff+DeMaagd · · Score: 1

      In my opinion, Chinese isn't really so bad, though it understandably looks intimidating to the uninitiated. There are problems, in my opinion the biggest is using a keyboard paradigm designed around Latin languages, but the rest of it is about trade-offs. There are a lot of problems learning English too. Witness how many people take a dozen years of English classes and can't articulate themselves halfway decently. English is an amalgam of three or four languages, plus a ridiculous number of loan words, and then there are all the idioms.

      Anyways, the numerous characters may seem daunting, but there's a method to the madness, it's often possible to derive the meaning and pronunciation of a character based on its sub-glyphs. I don't pretend to have that term right, it's been a while since I covered it. I don't know how they handle the character input into computers though.

    6. Re:limiting? by nameer · · Score: 1

      Twenty six letters, sure. But twenty six glyphs? Far from it. Along with all of the punctuation (the obvious addition) there are ligatures, italics, bold, caps, small caps, etc. Authors use all of these tools to express complex ideas clearly when twenty six letters isn't enough.

      --
      "Uh... yeah, Brain, but where are we going to find rubber pants our size?" --Pinky
    7. Re:limiting? by Anonymous Coward · · Score: 0

      "the actual language stuff" - really shows how much you know about programming.

      TFA is questioning the accepted convention of ASCII in syntax, so of course it results that there are no non-obscure programming languages which use Unicode in its grammar.

      Anyone who thought TFS was discussing comments is an idiot; most compilers already support Unicode comments. Some C++ compilers even support Unicode in variable names. String literal encoding varies per language: not all are strict ASCII.

    8. Re:limiting? by yuje · · Score: 3, Informative
      China has greater than 90% literacy, and the more advanced Chinese speaking societies (Hong Kong, Taiwan, Macau, Singapore) basically have full Chinese literacy. While Japan uses a smaller subset of those characters, the Japanese have full literacy and seem to have functioned perfectly well while retaining those characters in their writing system. The Chinese people hardly have problems learning, reading, or writing their own language.

      the chinese have problems to learn his own language, because have all that signs, it make it unncesary complex.

      26 letter lets you write anything, you dont need more letters, really. ask any novelist.

      also, programming languages are something international, and not all keyboards have all keys, even keys like { or } are not on all keyboards, so tryiing to use funny characters like ñ would make programming for some people really hard.

      all in all, this is not a very smart idea , imho

      Judging by your post, it appears that you have problems learning your own language. It certainly appears that simple spelling, capitalization, punctuation and correct grammar in the English language are apparently beyond your abilities.

    9. Re:limiting? by Anonymous Coward · · Score: 0

      You mean ask any English speaking writer? Hmm, English.. that's Roman characters and Arabic numbers then. So what's your point, really? Use anything you personally are familiar with, and fuck the rest? Yes, you must be American or a little-englander. Good job you're too stoopid to have a job dealing with real people. QUICK! that burger needs flipping fuckwit.

    10. Re:limiting? by mr_mischief · · Score: 1

      Perl6 has the option, but anything that's Unicode has an ASCII equivalent which is sometimes a digraph or trigraph.
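
      The general idea of giving every Unicode operator an ASCII spelling can be sketched as a lookup table. The table below is hypothetical and written in Python for illustration; it is not Perl6's actual operator list, and to_ascii is a made-up helper.

```python
# Hypothetical table: each Unicode operator has an ASCII spelling.
ASCII_EQUIVALENTS = {
    "≤": "<=",
    "≥": ">=",
    "≠": "!=",
    "×": "*",
    "÷": "/",
}

def to_ascii(source):
    """Replace each listed Unicode operator with its ASCII spelling.

    Naive on purpose: it does not skip string literals or comments.
    """
    for glyph, spelling in ASCII_EQUIVALENTS.items():
        source = source.replace(glyph, spelling)
    return source

print(to_ascii("if a ≤ b and b ≠ 0: ratio = a ÷ b"))
```

      Because the mapping is one-to-one, the same source can be shown with either spelling, which keeps the code typeable on any keyboard.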

    11. Re:limiting? by AnonymousClown · · Score: 1
      Less letters?

      All you need is A - C - F - K - U.

      Here's a conversation I once heard:

      A: Fucking A!

      B: Fuck?

      A: Fuck!

      B: Fuuuuuuuccccccck!

      A: Fuck.

      --
      RIP America

      July 4, 1776 - September 11, 2001

    12. Re:limiting? by LordLimecat · · Score: 1

      I think Mark Twain found your ideas intriguing, and may have subscribed to your newsletter.

    13. Re:limiting? by Dash+Hash · · Score: 1

      Also in terms of programming a big problem is that of ambiguity. Compilers can't handle it, their syntax and grammar is rigidly defined, as it must be. That's the reason we have programming languages rather than simply programming in a natural language: Natural language is too imprecise, a computer cannot parse it. We need a more rigidly defined language.

      Lojban disagrees with this, insofar as it is concerned with itself.

      --
      Calling a sword by a pretty name is no more than adding perfume to poison.
    14. Re:limiting? by Anonymous Coward · · Score: 0

      Or: Fuck, the fucking fucker's fucking fucked.

    15. Re:limiting? by Anonymous Coward · · Score: 0

      Whoever came up with the code that DNA uses, needed only four letters. The wide variety of living things that depend on this simple code shows that four letters is enough to express all that incredible complexity.

      Thus, a keyboard with four letters should be enough, shouldn't it?

    16. Re:limiting? by mulvane · · Score: 1
    17. Re:limiting? by Aryarnak · · Score: 1

      Sanskrit is a natural language which is rigidly defined by 4000 rules.

    18. Re:limiting? by rubycodez · · Score: 1

      C is really redundant, and you have an ING in there. might as well add O so we can express negation.

      A: Fuking A!
      B. Fuk?
      A. Fuk!
      B: FuuuuK? Go FuK! No Fuking A!
      A: Fuk!
      B: Fuuuuuking Fuk! FukFukFukFukFukFuk!
      A: Fuk.

    19. Re:limiting? by sznupi · · Score: 1

      All creators of chorded keyboards were up to something after all?

      --
      One that hath name thou can not otter
    20. Re:limiting? by Fareq · · Score: 1

      A Plan for the Improvement of English Spelling

      For example, in Year 1 that useless letter c would be dropped to be replased either by k or s, and likewise x would no longer be part of the alphabet. The only kase in which c would be retained would be the ch formation, which will be dealt with later.

      Year 2 might reform w spelling, so that which and one would take the same konsonant, wile Year 3 might well abolish y replasing it with i and Iear 4 might fiks the g/j anomali wonse and for all.

      Jenerally, then, the improvement would kontinue iear bai iear with Iear 5 doing awai with useless double konsonants, and Iears 6-12 or so modifaiing vowlz and the rimeining voist and unvoist konsonants.

      Bai Iear 15 or sou, it wud fainali bi posibl tu meik ius ov thi ridandant letez c, y and x — bai now jast a memori in the maindz ov ould doderez — tu riplais ch, sh, and th rispektivli.

      Fainali, xen, aafte sam 20 iers ov orxogrefkl riform, wi wud hev a lojikl, kohirnt speling in ius xrewawt xe Ingliy-spiking werld.

      Mark Twain

    21. Re:limiting? by vlueboy · · Score: 1

      also, programming languages are something international, and not all keyboards have all keys, even keys like { or } are not on all keyboards, so tryiing to use funny characters like ñ would make programming for some people really hard.

      Absolutely right and proven every time I mess with alternative language settings in a new Ubuntu install for my not-so-English-ready relatives.

      The first time I made the mistake, I couldn't write shell commands. Even simple one-liners and editing scripts require ampersands, tildes, colons, hashes, dollar signs and many other ignored symbols. It's pretty tough that a pipe requires a specific dead key. (blue, in the diagram). It's tougher that when I switch back to the US layout, a simple ñ is no longer as easy as Windows' trusty ALT-1,6,4, and neither are the accented vowels.

    22. Re:limiting? by scheme · · Score: 1

      Whoever came up with the code that DNA uses, needed only four letters. The wide variety of living things that depend on this simple code shows that four letters is enough to express all that incredible complexity.

      Thus, a keyboard with four letters should be enough, shouldn't it?

      Not true, RNA has uracil and there's epigenetics. Among other things, methylation of the genome and modifications to the histones encode information on what genes are activated and how they get expressed. It turns out that the DNA base pairs are just the beginning of how genes get encoded and expressed.

      --
      "When you sit with a nice girl for two hours, it seems like two minutes. When you sit on a hot stove for two minutes, it
    23. Re:limiting? by Anonymous Coward · · Score: 0

      Might as well add the B as well, since you both seem to want to use it.

    24. Re:limiting? by tokul · · Score: 1

      26 letter lets you write anything, you dont need more letters, really. ask any novelist.

      Only if your "anything" is limited to basic US-English and Swahili.

      We are not talking about real world texts here, and limiting programming languages to ASCII is a good thing. The article tries to introduce real world texts into programming. Not gonna work. ASCII is a small character table shared between a large group of people. Symbols outside of ASCII are generally limited to some language group, and people from other language groups will have trouble recognizing them.

    25. Re:limiting? by Pentium100 · · Score: 1

      Chinese or Japanese are hard to learn because of the writing system. At least for me, (English is not my native language,though I know it quite well, I know only a few Japanese words), Japanese is quite easy to pronounce, except the R/L thing. However, the writing system is hard. There are a lot of characters to memorize (and they also usually have specific stroke order too). Also, AFAIK, if you don't know the word, you don't know how many symbols it takes and consequently, where the next word begins.

      Compare that to Russian - while the letters are different from those of my native language, there are only 26 or so of them, so I can read a text aloud (slowly) even though I don't understand the words. Now, when I am trying to read a technical text, like a datasheet of a vacuum tube or an explanation of some circuit, I can sort-of get what the text is about based on words I already know (including international words) and my general knowledge of electronics, and as such I don't have to type* the whole text into google's translator. Also, Russian text uses spaces, so I can skip a word I don't know and try to deduce its meaning based on the other words.

      *typing is also a problem with huge character sets - I downloaded a keyboard layout that maps most of the Cyrillic letters to the similar sounding Latin letters (well, except those that don't have a Latin equivalent) instead of the normal "Russian typewriter" layout where "F" (which looks like Greek letter phi) is on "A" and "A" (which looks like Latin "A") is on "F". Now, how do I type a kanji symbol when I don't know how it sounds and without looking it up on a paper dictionary?

    26. Re:limiting? by paedobear · · Score: 1

      Quotes of "full literacy" for Japanese/Chinese are pure unadulterated bullshit - if you're willing to drop your standards so hard that China and Japan hit 90% literacy then you may as well claim that the UK, US, Australia, and other English-speaking nations have 100% - or better than 100% - literacy. I can tell you right now that none of the people I work with - highly trained engineers - are actually literate to the levels that the Japanese government expects of 16 year olds.

    27. Re:limiting? by Anonymous Coward · · Score: 0

      We could drop down to two letters; say, 0 and 1...

    28. Re:limiting? by aaribaud · · Score: 1

      For that matter, we could probably even get away with less letters. Some of them are redundant when you get down to it. What you need are enough letters that you can easily denote all the different sounds that are valid in a language. You don't have to have a dedicated letter for all of them either, it can be through combination (for example the oo in soothe) or through context sensitivity (such as the o in some in context with the e on the end). We could probably knock off a few characters if we tried. If that is worth it or not I don't know but we sure as hell shouldn't be looking at adding MORE.

      ... and you'd end up with an alphabet the size of the Roman alphabet, no less. After all, it *is* the result of what you describe (the set of letters being rather constant across all countries using it, but the rules for 'decompressing' back into sound varying for each country, of course).

    29. Re:limiting? by Anonymous Coward · · Score: 0

      Try us-international altgr dead keys layout.
      You get the US layout for programming, and probably all special characters you need are available as a combination with AltGR, as far as I can find at least these (if slashdot manages to display them): éüúíóöáßðøáæñç
      I use it to write German, Swedish and French (though it doesn't work 100% for French).

    30. Re:limiting? by pjt33 · · Score: 1

      You don't have to have a dedicated letter for all of them either

      That's good, or you'd have to *add* a bunch of letters to English, which uses 5 letters to represent about 20 vowel sounds.

    31. Re:limiting? by Space_Pirate_Arrr · · Score: 1

      Doubleplusgoodthink!

    32. Re:limiting? by Anonymous Coward · · Score: 0

      Chinese characters or whatever you want to call them are actually words with meaning/expression. They're not letters. So please count the words in an English dictionary before you call them unnecessarily complex.

    33. Re:limiting? by Anonymous Coward · · Score: 0

      You could be just as expressive with a 2 character alphabet. Why don't we use that?

    34. Re:limiting? by c0lo · · Score: 1

      For that matter, we could probably even get away with less letters. Some of them are redundant when you get down to it. What you need are enough letters that you can easily denote all the different sounds that are valid in a language.

      But ov kors, zat's a step clos to Euro-English, vat a joy.

      --
      Questions raise, answers kill. Raise questions to stay alive.
    35. Re:limiting? by sznupi · · Score: 1

      Unambiguous and logical is boring anyway; it can have much more interesting lineage.

      --
      One that hath name thou can not otter
    36. Re:limiting? by fnj · · Score: 1

      For that matter, we could probably even get away with less letters. Some of them are redundant when you get down to it.

      A Plan for the Improvement of English Spelling, by Mark Twain:

      For example, in Year 1 that useless letter "c" would be dropped to be replased either by "k" or "s", and likewise "x" would no longer be part of the alphabet. The only kase in which "c" would be retained would be the "ch" formation, which will be dealt with later. Year 2 might reform "w" spelling, so that "which" and "one" would take the same konsonant, wile Year 3 might well abolish "y" replasing it with "i" and Iear 4 might fiks the "g/j" anomali wonse and for all.

      Jenerally, then, the improvement would kontinue iear bai iear with Iear 5 doing awai with useless double konsonants, and Iears 6-12 or so modifaiing vowlz and the rimeining voist and unvoist konsonants. Bai Iear 15 or sou, it wud fainali bi posibl tu meik ius ov thi ridandant letez "c", "y" and "x" -- bai now jast a memori in the maindz ov ould doderez -- tu riplais "ch", "sh", and "th" rispektivli.

      Fainali, xen, aafte sam 20 iers ov orxogrefkl riform, wi wud hev a lojikl, kohirnt speling in ius xrewawt xe Ingliy-spiking werld.

    37. Re:limiting? by Anonymous Coward · · Score: 0

      26 letter lets you write anything, you dont need more letters, really. ask any novelist.

      All you really need are two characters, a zero and a one.

    38. Re:limiting? by pipatron · · Score: 2, Insightful

      Judging by your post, it appears that you have problems learning your own language. It certainly appears that simple spelling, capitalization, punctuation and correct grammar in the English language are apparently beyond your abilities.

      Did it ever occur to you that the person you replied to isn't a native English speaker?

      --
      c++; /* this makes c bigger but returns the old value */
    39. Re:limiting? by Anonymous Coward · · Score: 0

      "...we could probably even get away with less letters."

      A Plan for the Improvement of English Spelling
      by Mark Twain

      For example, in Year 1 that useless letter "c" would be dropped
      to be replased either by "k" or "s", and likewise "x" would no longer
      be part of the alphabet. The only kase in which "c" would be retained
      would be the "ch" formation, which will be dealt with later. Year 2
      might reform "w" spelling, so that "which" and "one" would take the
      same konsonant, wile Year 3 might well abolish "y" replasing it with
      "i" and Iear 4 might fiks the "g/j" anomali wonse and for all.

      Jenerally, then, the improvement would kontinue iear bai iear
      with Iear 5 doing awai with useless double konsonants, and Iears 6-12
      or so modifaiing vowlz and the rimeining voist and unvoist konsonants.
      Bai Iear 15 or sou, it wud fainali bi posibl tu meik ius ov thi
      ridandant letez "c", "y" and "x" -- bai now jast a memori in the maindz
      ov ould doderez -- tu riplais "ch", "sh", and "th" rispektivli.

      Fainali, xen, aafte sam 20 iers ov orxogrefkl riform, wi wud
      hev a lojikl, kohirnt speling in ius xrewawt xe Ingliy-spiking werld.

    40. Re:limiting? by badkarmadayaccount · · Score: 1

      Runic letter thorn. No need for x.

      --
      I know tobacco is bad for you, so I smoke weed with crack.
  17. Fisher-Price programming? by Anonymous Coward · · Score: 0

    Yes, I want all the keys on my expensive LCD screen keyboard to look like it came straight from Fisher-Price just so I can do some programming.

    Now, where's the :rolleyes: ascii character on this traditional keyboard...

  18. lame by Anonymous Coward · · Score: 0, Offtopic

    This is lame. If you can't program using just the keyboard in front of you, GTFO

  19. This is nonsense by Kohath · · Score: 4, Insightful

    Programming languages usually have too much syntax and too much expressiveness, not too little. We don't need them to be even more cryptic and even more laden with hidden pitfalls for someone who is new, or imperfectly vigilant, or just makes a mistake.

    If anything, programming needs to be less specific. Tell the system what you're trying to do and let the tools write the code and optimize it for your architecture.

    We don't need longer character sets. We don't need more programming languages or more language features. We need more productive tools, software that adapts to multithreaded operation and GPU-like processors, tools that prevent mistakes and security bugs, and ways to express software behavior that are straightforward enough to actually be self-documenting or easily explained fully with short comments.

    Focusing on improving programming languages is rearranging the deck chairs.

    1. Re:This is nonsense by Twinbee · · Score: 2, Interesting

      One day, I think we'll have a universal language that everyone uses (yeah English would suit me, but I don't care as long as whatever language it is, everyone uses it). Efficiency would rocket through the roof, and hence we'll save billions or trillions of pounds.

      In the same way, we'll all be using a single programming language too (even if that language combines more than one paradigm). Yes competition is good in the mean time, but I mean ultimately. It'll be as fast as C or machine code, but as readable as a much higher level language. It won't have baggage such as headers or be unnecessarily verbose either.

      Until that point, we need to do a lot more to improve languages, and it won't just be deckchair arranging.

      --
      Why OpalCalc is the best Windows calc
    2. Re:This is nonsense by Anonymous Coward · · Score: 0

      I thought it was nonsense for a different reason. From TFA:

      Why do we still have to name variables OmegaZero when our computers now know how to render 0x03a9+0x2080 properly?

      Ken Thompson's Plan 9 C compilers -- which are the C compilers distributed with Go -- allow Unicode for variable, function, and preprocessor names. If Poul-Henning Kamp, the author of this nonsense, had used any compilers that Rob Pike had a hand in making, or even bothered to read the papers written about them, he'd know this.

      It's a shame I can't use UTF in a discussion about UTF. Commander Taco, tear down this ASCII wall.
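
For anyone wondering what the two code points quoted from TFA actually are, here is a minimal Python sketch that looks them up. Python is only an illustration here and says nothing about what the Plan 9 or Go toolchains accept; the helper name `describe` is made up for this example.

```python
import unicodedata

def describe(text: str) -> list[str]:
    """Return one 'U+XXXX NAME' line per character in text."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

# The two code points quoted from TFA: 0x03a9 followed by 0x2080.
omega_zero = "\u03a9\u2080"
for line in describe(omega_zero):
    print(line)

# str.isidentifier() reports whether Python itself would accept a name;
# each language draws this line differently.
print("\u03a9".isidentifier())
```

The first code point is GREEK CAPITAL LETTER OMEGA and the second is SUBSCRIPT ZERO. Whether the pair is legal as an identifier depends entirely on the language's own rules.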

    3. Re:This is nonsense by Anonymous Coward · · Score: 0

      Programming as an interpretative art? That's going to lead to some surprising bugs.

      Going to work even worse than garbage collection. Making computers slower just to make programmers more lazy.

      Want to delete a little something you just used just this once won't need again? "Ok then", says the compiler and emits code to delete the mbr.

    4. Re:This is nonsense by naasking · · Score: 1

      Programming languages usually have too much syntax and too much expressiveness, not too little. We don't need them to be even more cryptic and even more laden with hidden pitfalls for someone who is new, or imperfectly vigilant, or just makes a mistake.

      The point is to move software closer to math, whose syntax is fairly standard and where there are no hidden pitfalls, and perfect vigilance is less important because proof of consistency is either inferred (type inference/proof automation), or provided.

      We don't need longer character sets. We don't need more programming languages or more language features. We need more productive tools

      The need for tools to solve tasks that cannot be solved within the language is a deficiency of the language.

    5. Re:This is nonsense by Anonymous Coward · · Score: 0

      If anything, programming needs to be less specific. Tell the system what you're trying to do and let the tools write the code and optimize it for your architecture.

      We need all the specifics to get our exact meaning across. I, for one, do not want my interpretive computer overlord to make any assumptions on where I wanted "Hello World" printed.

    6. Re:This is nonsense by Kohath · · Score: 1

      Going to work even worse than garbage collection. Making computers slower just to make programmers more lazy.

      It would be faster. Most software poorly utilizes multiple cores and GPU-like architectures. If the language is simple enough, the compiler or the VM can parallelize the object code.

    7. Re:This is nonsense by Kohath · · Score: 1

      We need tools so that mediocre programmers can produce (nearly) bug-free software in a fraction of the time it currently takes expert programmers.

    8. Re:This is nonsense by Kohath · · Score: 1

      We need all the specifics to get our exact meaning across. I, for one, do not want my interpretive computer overlord to make any assumptions on where I wanted "Hello World" printed.

      And that desire to micro-manage everything down to the most minute detail comes at an enormous cost: you are forced to micro-manage everything down to the most minute detail.

      And if you forget a detail, your software has bugs or security holes.

      And, as systems get more complex, the number of details increases faster than you can handle them.

      And, as time passes, the details change. And now, with 12-core CPUs and GPU-like chips, the details have changed beyond the intended design of conventional languages.

      The way to solve problems like this, where the tasks outpace ordinary human abilities, is either

      A: Get a huge, super-expensive staff or
      B: automation. Let the software do the grunt work. You tell it where to go, when to arrive, and what to bring. It takes care of moving the feet, reading the map, and picking the route.

    9. Re:This is nonsense by ObsessiveMathsFreak · · Score: 1

      Programming languages usually have too much syntax and too much expressiveness, not too little.

      I take it you're a Python man!

      --
      May the Maths Be with you!
  20. Program vs. Literature by oldhack · · Score: 1

    We like economy and precision in programming languages. You may have many complaints about English, but it's pretty damn good common language due to its slutty tendency - it soaks in whatever useful from other languages.

    In general, I don't want poetry in coding. I definitely don't want Egyptian glyphs or Chinese ideograms.

    --
    Fuck systemd. Fuck Redhat. Fuck Soylent, too. Wait, scratch the last one.
    1. Re:Program vs. Literature by ScrewMaster · · Score: 1

      You may have many complaints about English, but it's pretty damn good common language due to its slutty tendency

      I'm guessing that programming in French would not be among your short list.

      --
      The higher the technology, the sharper that two-edged sword.
  21. Two words: Perl 6 by Etcetera · · Score: 1

    I like how he mentions perl, but completely neglects to mention Perl6.

    One of the most derided or most lauded features (depending on your POV) in perl6 is the copious use of additional syntax operators in the interests of further Huffman coding. There are certain operators (for example, the "hyper" operators) that are defined in terms of unicode symbols ("»") and use ASCII digraphs as an alternate form (">>").

    So, it's there now in a mostly stable form... you can program in unicode-laced form all you like at this point.

    1. Re:Two words: Perl 6 by rrohbeck · · Score: 1

      Nah, the ASCII tokens are much better, like the cheerleader-plus "*+*".

  22. No we don't by Sycraft-fu · · Score: 4, Informative

    Because I don't want to have to own a 2000 key keyboard, or alternatively learn a shitload of special key combos to produce all sorts of symbols. The usefulness of ASCII, and just of the English/Germanic/Latin character set and Arabic numerals in general is that it is fairly small. You don't need many individual glyphs to represent what you are talking about. A normal 101 key keyboard is enough to type it out and have enough extra keys for controls that we need.

    To see the real absurdity of it, apply the same logic to the numerals of the character set. Let's stop using Arabic numerals, let's use something more. Let's have special symbols to denote commonly used values (like 20, 25, 100, 1000). Let's have different number sets for different bases so that a 3 can be told what base its in just by the way it looks! ...

    Or maybe not. Maybe we should stick with the Arabic numerals. There's a reason they are so widely used: The Indians/Arabs got it right. It is simple, direct, and we can represent any number we need easily. Combining them with simple character indicators like H to indicate hex works just fine for base as well.

    You might notice that even languages that don't use the English/ASCII character set tend to use keyboards that use it. Japanese and Chinese enter transliterated expressions that the computer then interprets as glyphs. Doesn't have to be that way, they could use different keyboards, some of them rather large depending on the character set being used, but they don't. It is easy and convenient to just use the smaller, widely used, character set.

    Now none of this means that you can't use Unicode in code, that strings can't be stored using it, that programs can't display it. Indeed most programs these days can handle it, just fine. However to start coding in it? To try and design languages to interpret it? To make things more complex for their own sake? Why?

    I am just trying to figure out what he thinks would be gained here. Also remembering that the programming languages, the compilers, would need to be changed at the low level. Compilers do not take ambiguity, if a command is going to change from a string of ASCII characters to a single unicode one, that has to be changed in the compiler, made clear in the language specs and so on.

    1. Re:No we don't by ScrewMaster · · Score: 1

      Because I don't want to have to own a 2000 key keyboard, or alternatively learn a shitload of special key combos to produce all sorts of symbols. The usefulness of ASCII, and just of the English/Germanic/Latin character set and Arabic numerals in general is that it is fairly small. You don't need many individual glyphs to represent what you are talking about. A normal 101 key keyboard is enough to type it out and have enough extra keys for controls that we need.

      The reality is that it works well enough for so many things that changing it would be gratuitous, and would rather quickly reach the point of diminishing returns. Why do our Asian friends find themselves simplifying many of their languages in the post-industrial, global-economic, Internet-driven world? It's because some arbitrarily high degree of expressiveness is a drawback in ordinary business and technical discourse.

      --
      The higher the technology, the sharper that two-edged sword.
  23. APL! by thisisauniqueid · · Score: 1

    Let's go back to APL!

  24. ASCII art is cool! by Joe+The+Dragon · · Score: 4, Insightful

    ASCII art is cool!

    1. Re:ASCII art is cool! by Andy+Smith · · Score: 1

      Citation?

    2. Re:ASCII art is cool! by Anonymous Coward · · Score: 0

      needed

    3. Re:ASCII art is cool! by znerk · · Score: 1

      Here's yer citation. Sorry you're too lazy to use a search engine.

      --
      This work is licensed under a Creative Commons Attribution 3.0 Unported License.
  25. Unicode symbols in Code??? by zAPPzAPP · · Score: 1

    I don't get it.
    When coding, I already am annoyed by the placement of <> on my keyboard, on a key that I don't reach easily (good thing I don't do html, hehe). Using lots of symbols, that require me to do a two-key combination, slow me down.
    Now I'm supposed to use Unicode? Is that guy insane?
    How am I supposed to type out unicode expressions on my keyboard, without typing in the whole 4 digit number?
    And if I want to address a unicode-named variable, but I forgot the magical number to make it appear.. then what? Copy paste?

    Must be a joke then, right.

    1. Re:Unicode symbols in Code??? by mfnickster · · Score: 1

      Using lots of symbols, that require me to do a two-key combination, slow me down.
      Now I'm supposed to use Unicode? Is that guy insane?
      How am I supposed to type out unicode expressions on my keyboard, without typing in the whole 4 digit number?

      The obvious solution, if an ugly one, is to use escape sequences or symbolic entities like in TeX.

      If you want to use theta as a variable, type something like "\theta = 20" etc. Having a symbolic name for each Unicode character, expressible in ASCII, would be prerequisite.

      A pain in the ass? Maybe, but the pluses it offers may make it worthwhile.
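
A minimal sketch of the escape idea above, in Python. It assumes the escapes are limited to Greek letter names and leans on the official Unicode character names through `unicodedata.lookup`; the function name `expand_escapes` and the backslash convention are hypothetical, not taken from TeX or any existing language.

```python
import re
import unicodedata

def expand_escapes(source: str) -> str:
    """Replace backslash-plus-name escapes with the Greek letter they name."""
    def replace(match: re.Match) -> str:
        name = match.group(1)
        case = "CAPITAL" if name[0].isupper() else "SMALL"
        try:
            return unicodedata.lookup(f"GREEK {case} LETTER {name.upper()}")
        except KeyError:
            # Unknown name: leave the escape exactly as typed.
            return match.group(0)
    return re.sub(r"\\([A-Za-z]+)", replace, source)

print(expand_escapes(r"\theta = 20"))
```

A real tool would need a much bigger name table and would have to skip string literals, but the typing cost stays at plain ASCII.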

      --
      "Slow down, Cowboy! It has been 3 years, 7 months and 26 days since you last successfully posted a comment."
    2. Re:Unicode symbols in Code??? by arose · · Score: 1

      How am I supposed to type out unicode expressions on my keyboard, without typing in the whole 4 digit number?

      No, just like you don't have to enter a bunch of whitespace now, if you use an editor that has some awareness of the language you use. You could, for example, enter '=>' for an arrow '==' for 224D or 2261, pi for its symbol, etc. and let the editor deal with the magic. It wouldn't affect input as much as readability (IMHO, in a positive way).

      And if I want to address a unicode-named variable, but I forgot the magical number to make it appear.. then what? Copy paste?

      Well, you could always disallow it in variable names, after all, most ASCII symbols aren't legal in the variable names in most languages.

      --
      Analogies don't equal equalities, they are merely somewhat analogous.
    3. Re:Unicode symbols in Code??? by Anonymous Coward · · Score: 0

      > No, just like you don't have to enter a bunch of whitespace now, if you use an editor that has some awareness of the language you use. You could, for example, enter '=>' for an
      > arrow '==' for 224D or 2261, pi for its symbol, etc. and let the editor deal with the magic. It wouldn't affect input as much as readability (IMHO, in a positive way).

      And what, pray tell, is the point of changing the programming language for that? Then everyone _has_ to use a special editor.
      You can just use an editor that shows != or >= as special symbols and it works fine with C as it is.
      You can even make it replace omega, *_omega, omega_* or *_omega_* by the appropriate symbol.
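
Roughly what such a display-only pass might look like, as a Python sketch. The file on disk stays ASCII and only the rendered line changes. The glyph table and the name `render_for_display` are assumptions for illustration, not any particular editor's feature.

```python
import re

# Hypothetical display table: ASCII operator -> glyph shown on screen.
OPERATOR_GLYPHS = {"!=": "≠", ">=": "≥", "<=": "≤"}

_OPERATOR_RE = re.compile("|".join(re.escape(op) for op in OPERATOR_GLYPHS))
# "omega" only counts when it is a whole underscore-delimited part of a name,
# which covers omega, *_omega, omega_* and *_omega_*.
_OMEGA_RE = re.compile(r"(?<![A-Za-z0-9])omega(?![A-Za-z0-9])")

def render_for_display(line: str) -> str:
    """Return the line as it would be shown; the stored source is untouched."""
    line = _OPERATOR_RE.sub(lambda m: OPERATOR_GLYPHS[m.group(0)], line)
    return _OMEGA_RE.sub("ω", line)

print(render_for_display("if (omega_max >= limit && d_omega != 0)"))
```

This is deliberately naive: it would also rewrite text inside string literals and mangle an operator like `>>=`, so a real editor would do the substitution per token rather than per character run.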

  26. What about Sun's Fortress language by philgross · · Score: 4, Informative

    Sun's Fortress language allowed you to use real, LaTeX-formatted math as source code. They reasoned, correctly I think, that for the mathematically literate, this would make the programs far clearer. Google for Fortress Programming Language Tutorial.

    1. Re:What about Sun's Fortress language by Anonymous Coward · · Score: 0

      And LaTeX is written in plain ASCII... ;-)

  27. The article's author's concerns are misdirected by Anonymous Coward · · Score: 0

    While I agree that compatibility with ASR-33 should be tossed to the side, replacing ASCII alone isn't going to solve this problem. The article argues that language developers have had to squeeze reliable syntax out of a small character set, but this is a result of the problem, not the cause of it. Extensibility is the key. Where we are trapped is in syntax definition. As mentioned with C/C++ being unable to define custom operators. If a problem need be solved here (which, IMHO this isn't really a problem), then its solution is making every keyword and type user extensible. However, doing this sort of thing can and would have major repercussions across the business world. When types and basic math become a matter of contention things can get ugly really quick. We'd spend the first 5 years hoping that the market would pan out an interface library of common custom types.

    But I digress. Sane localization of syntax via unicode isn't too horrible an idea. 1 to 1 translation of words should be fairly easy to implement without loss of meaning when re-localized. However, development is about logic, not necessarily math. While mathematics does define a whole slew of operators we don't have the option of typing in 1 character, typing/reading their names works well for logic.


    Full Disclosure: While I develop in many languages my day to day development is done in Visual Studio, and I'm therefore one of those bastards that's at least a bit spoiled by Intellisense.

  28. Fortress allows Unicode, but has ASCII equivalent by thisisauniqueid · · Score: 3, Interesting

    Fortress allows you to code in UTF-8. However it has a multi-char ASCII equivalent for every Unicode mathematical symbol that you can use, so there is a bijective map between the Unicode and ASCII versions of the source, and you can view/edit in either. That is the only acceptable way to advocate using Unicode anywhere in programming source other than string constants. Programming languages that use ASCII have done well over those that don't, for the same reason that Unicode has done well over binary formats.
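
To make the two-way mapping concrete, here is a small Python sketch of a round trip between an ASCII spelling and a Unicode spelling. The table is invented for illustration and does not reproduce Fortress's actual ASCII encodings; it also only works on whole space-separated tokens.

```python
# Hypothetical ASCII spellings; Fortress's real tables are not reproduced here.
ASCII_TO_UNICODE = {"<=": "≤", ">=": "≥", "/=": "≠", "->": "→"}
# One-to-one table, so simply inverting it gives the reverse direction.
UNICODE_TO_ASCII = {glyph: ascii_form for ascii_form, glyph in ASCII_TO_UNICODE.items()}

def translate(source: str, table: dict) -> str:
    """Swap whole space-separated tokens found in table; leave the rest alone."""
    return " ".join(table.get(token, token) for token in source.split(" "))

ascii_source = "if x <= y -> z /= 0"
unicode_source = translate(ascii_source, ASCII_TO_UNICODE)
print(unicode_source)

# Round trip: converting back recovers the ASCII text exactly.
assert translate(unicode_source, UNICODE_TO_ASCII) == ascii_source
```

The round trip only holds when a file sticks to one spelling at a time. A file that mixes both forms of the same operator would collapse to a single form, which is why a canonical form per view matters.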

  29. I don't see the problem and the problem is solved by davidwr · · Score: 1

    Sure, strings and other items that can be seen on the screen would benefit from an expanded character set, but otherwise, why bother?

    The only advantage I can think of is so that variable names, function names, and other user-defined non-display values can be in languages other than English or other Latin-letter languages. However, as English is currently the lingua franca of the technology world, encouraging fragmentation in this area is not a good idea.

    Besides, nothing stops you from writing your code in Chinese or whatever other Unicode character set you want and using a preprocessor to convert it into ASCII before it hits the compiler or interpreter. The only "gotcha" is that there isn't a standardized way of doing the conversion, which can make it hard to link to binary libraries unless you use the same pre-processor.
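
As a rough illustration of such a preprocessor, this Python sketch rewrites every non-ASCII character as an ASCII escape before the text reaches a compiler. The `_uXXXX_` scheme and the name `mangle_to_ascii` are made up for this example, which is exactly the "no standardized way" gotcha: two tools picking different schemes would not link against each other.

```python
def mangle_to_ascii(source: str) -> str:
    """Replace each non-ASCII character with a _uXXXX_ escape (hypothetical scheme)."""
    return "".join(
        ch if ord(ch) < 128 else f"_u{ord(ch):04X}_"
        for ch in source
    )

# A variable named "count" in Chinese, incremented by one.
print(mangle_to_ascii("计数 = 计数 + 1"))
```

A usable version would leave string literals and comments alone and would have to guard against names that already contain something shaped like an escape.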

    --
    Knowledge is how to play a game, intelligence is how to win, wisdom is knowing what game to play.
  30. Re:Fortress allows Unicode, but has ASCII equivale by thisisauniqueid · · Score: 1

    Sorry, meant to say "for the same reason XML [not Unicode] has done well over binary formats".

  31. Haskell by kshade · · Score: 3, Interesting

    Haskell supports various unicode characters as operators and it makes me want to puke. http://hackage.haskell.org/trac/haskell-prime/wiki/UnicodeInHaskellSource IMO one of the great things about programming nowadays is that you can use descriptive names without feeling bad. Single character identifiers from different alphabets are something that rub me the wrong way in mathematics. Keep 'em out of my programming languages!

    Bullshit from the article:

    Unicode has the entire gamut of Greek letters, mathematical and technical symbols, brackets, brockets, sprockets, and weird and wonderful glyphs such as "Dentistry symbol light down and horizontal with wave" (0x23c7). Why do we still have to name variables OmegaZero when our computers now know how to render 0x03a9+0x2080 properly?

    OmegaZero is at least something everybody will recognize. And why would you name a variable like that anyway? It's programming, not math, use descriptive names.

    But programs are still decisively vertical, to the point of being horizontally challenged. Why can't we pull minor scopes and subroutines out in that right-hand space and thus make them supportive to the understanding of the main body of code?

    Because we're not using the same IDE?

    And need I remind anybody that you cannot buy a monochrome screen anymore? Syntax-coloring editors are the default. Why not make color part of the syntax? Why not tell the compiler about protected code regions by putting them on a framed light gray background? Or provide hints about likely and unlikely code paths with a green or red background tint?

    ... what?

    For some reason computer people are so conservative that we still find it more uncompromisingly important for our source code to be compatible with a Teletype ASR-33 terminal and its 1963-vintage ASCII table than it is for us to be able to express our intentions clearly.

    ... WHAT? If you don't express your intentions clearly in a program it won't work!

    And, yes, me too: I wrote this in vi(1), which is why the article does not have all the fancy Unicode glyphs in the first place.

    vim does Unicode just fine. And from the Wikipedia entry on the author (http://en.wikipedia.org/wiki/Poul-Henning_Kamp):

    A post by Poul-Henning is responsible for the widespread use of the term bikeshed colour to describe contentious but otherwise meaningless technical debates over trivialities in open source projects.

    Irony? Why does this guy come off as an idiot who got annoyed by VB in this article when he clearly should know better?

    1. Re:Haskell by Frater+219 · · Score: 1

      If your language actually uses the character U+23C7 "DENTISTRY SYMBOL LIGHT DOWN AND HORIZONTAL WITH WAVE" as an operator, your editor will let you type it with a simple keyboard combination, like Compose-T-~. If you're using U.S. Windows and have to resort to Alt+numbers to type things, you're silly.

    2. Re:Haskell by Anonymous Coward · · Score: 0

      > It's programming, not math,

      Um... That's the same.

    3. Re:Haskell by Anonymous Coward · · Score: 0

      ...It's programming, not math, use descriptive names.

      It appears you are not familiar with Haskell or this (http://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspondence).

    4. Re:Haskell by kshade · · Score: 1

      It appears you are not familiar with Haskell or this (http://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspondence).

      I'm more familiar with Haskell than I'd like to be, thank you. What I meant is that programmers shouldn't use the horrible syntax you use in math for programming, e.g. no "funny" characters, descriptive names and all that.

  32. It's Halloween ... by Anonymous Coward · · Score: 0

    you're just trying to scare us ... right? ... right?

    I really can't think of a lot of coding that would usefully be done with a more 'expressive' character set. The output of the code often has to be expressive but that isn't the same.

    The most popular programming languages are Java, C, and C++ (http://langpop.com/). They aren't popular because they are easy to use. They are used because they are effective. The innovative languages are well down the list.

    You can read many reasons why the more innovative languages are better; in theory. C is either the most popular or second most popular language. There's a reason for that. Theory be damned.

    1. Re:It's Halloween ... by Anonymous Coward · · Score: 0

      The language used to build your body as well as the language of all living things has only four letters. It was used to construct your brain which is now programming computers. I would say that whoever came up with the programming language for DNA did so because it is pretty effective.

  33. Perl 6 by jepaton · · Score: 1

    Perl 6 has guillemets in its standard syntax (equivalent to "<<" and ">>"). These are non-ASCII symbols. It will also be possible to declare new operators using whatever character you want (e.g. a snowman operator, see: http://perl6advent.wordpress.com/2009/12/17/day-17-making-snowmen/).

    1. Re:Perl 6 by russotto · · Score: 2, Interesting

      Sure, but Perl is often derided as a "write only language", and Perl 6 is simply continuing the tradition.

  34. Re:Two words: Perl 5 by Krishnoid · · Score: 1

    Perl 5.8 and above have native Unicode string and I/O support, per the first chapter of the most current rev of the Perl Cookbook, and you can use utf8 as well to write your scripts in Unicode.

  35. On a related note... by Anonymous Coward · · Score: 0

    We should all use trinary systems instead of binary!

  36. Idiocracy Hospital Keyboard by theodp · · Score: 2, Interesting
  37. Wingdings of Disease by theodp · · Score: 2, Funny
  38. two key keyboard by pyrocam · · Score: 1

    all you really need is two keys, 1 and 0

    1. Re:two key keyboard by Anonymous Coward · · Score: 0

      I find "space" and "input" to be helpful as well.

  39. Author seems to be high or something by Tridus · · Score: 5, Insightful
    He comes up with a bunch of ideas at the end that are out to lunch. Let's take a look:

    Unicode has the entire gamut of Greek letters, mathematical and technical symbols, brackets, brockets, sprockets, and weird and wonderful glyphs such as "Dentistry symbol light down and horizontal with wave" (0x23c7). Why do we still have to name variables OmegaZero when our computers now know how to render 0x03a9+0x2080 properly?

    Well, let's think. Possibly because nobody knows what 0x03a9+0x2080 does without looking it up, and nobody seeing the character it produces would know how to type said character again without looking it up? I know consulting a wall-sized "how to type X" chart is the first thing I want to do every 3 lines of code.
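
    The lookup itself is at least easy to automate. As a minimal sketch in Python (picked here only for illustration; the article names no language), the standard unicodedata module reports the official names of the code points quoted above:

```python
import unicodedata

# Code points quoted from the article: omega, subscript zero, the dentistry symbol.
code_points = [0x03A9, 0x2080, 0x23C7]

# Map each code point (as hex text) to its official Unicode character name.
names = {hex(cp): unicodedata.name(chr(cp)) for cp in code_points}

for cp_hex, name in names.items():
    print(cp_hex, name)
```

    That answers "what is it called", but not "how do I type it", which is the harder half of the complaint.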

    While we are at it, have you noticed that screens are getting wider and wider these days, and that today's text processing programs have absolutely no problem with multiple columns, insert displays, and hanging enclosures being placed in that space? But programs are still decisively vertical, to the point of being horizontally challenged. Why can't we pull minor scopes and subroutines out in that right-hand space and thus make them supportive to the understanding of the main body of code?

    If you actually look at word processing programs, the document is also highly vertical. The horizontal stuff is stuff like notes, comments, revisions, and so on. Putting source code comments on the side might be a useful idea, but putting the code over there won't be unless the goal is to make it harder to read. (That said, widescreen monitors suck for programming.)

    And need I remind anybody that you cannot buy a monochrome screen anymore? Syntax-coloring editors are the default. Why not make color part of the syntax? Why not tell the compiler about protected code regions by putting them on a framed light gray background? Or provide hints about likely and unlikely code paths with a green or red background tint?

    So anybody who has some color-blindness (which is not a small number) can't understand your program? Or maybe we should make a red + do something different than a blue +? That's great once you do it six times, then it's just a mess. (Now if you want to have the code editor put protected regions on a framed light gray background, sure. But there's nothing wrong with sticking "protected" in front of it to define what it is.) It seems like he's trying to solve a problem that doesn't really exist by doing something that's a whole lot worse.

    --
    -- "So they told me that using the download page to download something was not something they anticipated." - Bill Gates
    1. Re:Author seems to be high or something by OrangeTide · · Score: 1

      I find that syntax coloring causes me tremendous eyestrain, likely because my prescription is so strong that I run into the refractive index of my lenses, which renders blue in double vision. A computer programmer who wears thick glasses, I know it sounds pretty unusual.

      --
      “Common sense is not so common.” — Voltaire
    2. Re:Author seems to be high or something by sjames · · Score: 1

      Not to mention nobody will even be able to guess at how to pronounce the symbols. I can just imagine the conversations now:

      Can you look at this, it keeps crashing right around here. OK, let's see...What is that called? Which one? The one that looks kinda like a constipated duck. Hrmm, don't know. Well, anyway, moving along, constipated duck = surprised platypus times ruptured squirrel plus empire state building divided by yacking eagle...hrmm, can yacking eagle ever equal 0? I don't know, search for it. How do you type that one? I don't know, look it up. HOW?!? I'm sure it's not actually listed under yacking eagle! Screw it, let's just rewrite this steaming pile. Hey, I know that one....

      Why stop with funny symbols, perhaps we can express our programs as finger painting or interpretive dance?

    3. Re:Author seems to be high or something by Anonymous Coward · · Score: 0

      It seems like he's trying to solve a problem that doesn't really exist by doing something that's a whole lot worse.

      Got to disagree with that. Most of TFA is rubbish, but the code highlighting and horizontal re-arrangement sound like a change toward visual programming and drag and drop syntactical association. If the source file wasn't an ASCII file but was program 'content' which could be interpreted by the IDE to display something meaningful containing visual cues for program flow and object association .. what would be a mess with that?

      Sounds like a nicely rational way to start moving from writing code to creating code. Graphics designers have graphics packages, why do programmers use glorified text editors instead of development packages? I doubt there's a movie studio in the world where the artists cut the meshes using vi. Similarly I doubt there's a code shop in the world where the programmers work with anything that doesn't closely resemble vi.

    4. Re:Author seems to be high or something by Pentium100 · · Score: 1

      I think that we should use the available colors. Make a "+" that has color (128,128,128) do something different than "+" with color (128,128,129).

      Also, as I program in Delphi (and am not a good programmer at that - but I can write simple tools for myself) - case sensitive variable and function names in C++ seem weird to me - it's harder to write, because you need to remember the case. Or do C/C++/PHP programmers write something like this:

      VAR = var *vaR + vAr/(vAR+Var)-VAr; // where each of them is a different variable?

      Same thing about file names in file systems used by Linux.

    5. Re:Author seems to be high or something by Anonymous Coward · · Score: 0

      That said, widescreen monitors suck for programming.

      Well not that bad if you use something like split view or two editors side-by-side. One as a reference, one where you implement. That said, being a poor student I haven't experienced what's like to do programming with a big monitor (or two or more big monitors, for that matter).

    6. Re:Author seems to be high or something by Kjella · · Score: 1

      Well, let's think. Possibly because nobody knows what 0x03a9+0x2080 does without looking it up, and nobody seeing the character it produces would know how to type said character again without looking it up? I know consulting a wall-sized "how to type X" chart is the first thing I want to do every 3 lines of code.

      Now you're being a bit disingenuous; the point would be to write functions like you do in math. You would not see "0x03a9+0x2080", you would see the actual omega sign with a subscript zero. The point would be to not create "computerfied" variables like "int delta_v = 0" but rather to match printed sources.

      If you actually look at word processing programs, the document is also highly vertical. The horizontal stuff is stuff like notes, comments, revisions, and so on. Putting source code comments on the side might be a useful idea, but putting the code over there won't be unless the goal is to make it harder to read. (That said, widescreen monitors suck for programming.)

      Hmm having a window that would show the subroutine whenever I put my cursor on one, without leaving the main code and flow that I'm working on doesn't sound like such a bad idea. Certainly worth trying.

      But there's nothing wrong with sticking "protected" in front of it to define what it is.)

      I agree this really doesn't make sense, you'd have to make the color part of the markup language. Who needs a <font color="#ff0000" /> in their code? Maybe if you have some intelligent code to color two way mapping that could hide the code, but that would just be when on display.

      --
      Live today, because you never know what tomorrow brings
    7. Re:Author seems to be high or something by Ecuador · · Score: 1

      You are right in most of your points and TFA is idiotic, however:

      (That said, widescreen monitors suck for programming.)

      Eh, no, you are just using them incorrectly. Rotate 90 degrees and try again.

      --
      Violence is the last refuge of the incompetent. Polar Scope Align for iOS
    8. Re:Author seems to be high or something by vlm · · Score: 1

      Not to mention nobody will even be able to guess at how to pronounce the symbols. I can just imagine the conversations now:

      Can you look at this, it keeps crashing right around here. OK, let's see...What is that called? Which one? The one that looks kinda like a constipated duck. Hrmm, don't know. Well, anyway, moving along, constipated duck = surprised platypus times ruptured squirrel plus empire state building divided by yacking eagle...hrmm, can yacking eagle ever equal 0? I don't know, search for it. How do you type that one? I don't know, look it up. HOW?!? I'm sure it's not actually listed under yacking eagle! Screw it, let's just rewrite this steaming pile. Hey, I know that one....

      Why stop with funny symbols, perhaps we can express our programs as finger painting or interpretive dance?

      Users have had to put up with this kind of BS from GUI designers for decades now. Oh, you say mail-merge is the icon that looks like mating centipedes? Oh obviously. And nothing intuitively says "internet browser" like orange comet wrapping itself around an Arrakis/Dune blue eyeball. Then there's the retro stuff like icons of 5.25 inch diskettes, where my kids ask what the heck that is.

      --
      "Science flies us to the moon. Religion flies us into buildings." - Victor Stenger
    9. Re:Author seems to be high or something by Anonymous Coward · · Score: 0

      \Omega_0. Unicode seems like a huge waste without a standard easy way to input it. But it would be nice to use math notation sometimes, and I suppose that French-speaking programmers, for example, would like to use correct accents in variable names, if they can't already.

  40. Artifacts by H3xx · · Score: 0

    I'm also wondering why we insist that:

    • our source code be able to be wrapped to 78 characters
    • a tab (0x09) character is equal to 8 spaces (unless you specify otherwise)
    • most major programming languages have function names that are in US English, even Ruby, which was developed by the Japanese, and Scilab's programming language which was developed by French scientists
    • POSIX regular expressions' [:alnum:] character class is most often written as [A-z0-9]
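
    A side note on that last habit: the range A-z is not the same set as the alphanumeric class, because six punctuation characters sit between "Z" and "a" in ASCII. A small sketch, using Python's re module purely for illustration (it has no POSIX bracket classes, so the strict pattern spells the ranges out):

```python
import re

loose = re.compile(r"[A-z0-9]")      # the range A..z also spans [ \ ] ^ _ `
strict = re.compile(r"[A-Za-z0-9]")  # ASCII letters and digits only

# The characters that sit between 'Z' (0x5A) and 'a' (0x61) in ASCII.
in_between = [chr(c) for c in range(ord("Z") + 1, ord("a"))]

accepted_by_loose = [ch for ch in in_between if loose.fullmatch(ch)]
accepted_by_strict = [ch for ch in in_between if strict.fullmatch(ch)]

print("loose accepts: ", accepted_by_loose)
print("strict accepts:", accepted_by_strict)
```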

    The truth is that we programmers prefer to be able to type things quickly without having to memorize character codes for a variety of Unicode characters; we want to be able to type simple variable and function names using a standard set of glyphs and not have to worry about remembering which variation of a Chinese pictograph was used.

    If it comes down to it, we could all just use Ook and not worry about language barriers (or getting much of anything done for that matter).

    --
    "Ubuntu" - an African word meaning "Slackware is too hard for me."
  41. Ok, let's go Unicode then! by Exitar · · Score: 1

    Now, what kind of marvelous and innovative language the author of the article will propose?

  42. how about a character solely for escaping by lulalala · · Score: 1

    though this is just a programmer's dream, I always wished that we have a character solely for the purpose of escaping other characters. This will have a few benefits:
    1. You won't need to escape this escape-character.
    2. makes it easier for different languages to use the same way to escape stuffs. I won't need to worry about this string that gets escaped in SQL, ASP then JavaScript.
    3. Having a new escaping character shouldn't impact the old code. It just gives the user another option.

    1. Re:how about a character solely for escaping by rubycodez · · Score: 1

      too often a program needs to feed another program. how are we going to escape your escape character to get it into the output that will be the input of another program that needs it?

      we'll escape it with another of itself? haha, we're back to what we have now....
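
      The doubling is easy to see in a toy model. This is only a sketch under the assumption that each program in the chain escapes by doubling one escape character; the escape() and unescape() helpers are hypothetical, not any real language's rules:

```python
def escape(text: str, esc: str = "\\") -> str:
    """Double every escape character so the next consumer reads it literally."""
    return text.replace(esc, esc + esc)


def unescape(text: str, esc: str = "\\") -> str:
    """Undo one layer of escape() by collapsing each doubled escape character."""
    return text.replace(esc + esc, esc)


payload = "C:\\temp"       # contains one literal backslash
layer1 = escape(payload)   # two backslashes after passing through one program
layer2 = escape(layer1)    # four backslashes after passing through two

for label, value in (("payload", payload), ("layer1", layer1), ("layer2", layer2)):
    print(label, value.count("\\"), value)
```

      Swapping the backslash for any other dedicated character (pass it as esc) gives the same growth, which is the point of the objection.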

    2. Re:how about a character solely for escaping by lulalala · · Score: 1

      (was going to show an example but couldn't figure out how to enter a Unicode character in \.)
      Well if this escaping character is displayed but is never printed, there is no need to escape itself.
      And if two languages implement the same logic to treat this escaping character, there is no need to re-escape it.
      At the same time, they can still use the old \n \t too.

    3. Re:how about a character solely for escaping by multipartmixed · · Score: 1

      It's called the compose key, and it's been on proper keyboards (hint: not PC keyboards) for over twenty years.

      --

      Do daemons dream of electric sleep()?
  43. Microsoft Visual Studio allows Unicode identifiers by Myria · · Score: 1

    Microsoft Visual C++ and C# allow Unicode identifiers; that is, variable and function names. Visual C++ allows this:

    int meow()
    {
        int áéíóú = 1;
        return áéíóú;
    }
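
    Visual C++ is not alone in this. As a rough Python analog (an illustration added for comparison, not something Visual Studio is involved in), Python 3 also accepts non-ASCII letters in identifiers:

```python
def meow():
    áéíóú = 1  # non-ASCII letters are valid identifier characters in Python 3
    return áéíóú


result = meow()
is_valid_name = "áéíóú".isidentifier()

print(result, is_valid_name)
```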

    --
    "Screw Sun, cross-platform will never work. Let's move on and steal the Java language." - Visual J++ Product Manager
  44. Would it be less tedious to have 10,000+ keys? by Sycraft-fu · · Score: 5, Insightful

    Because that's what you find in JIS X 0213:2000. Even if you simplify it to just what is needed for basic literacy, you are talking 2000 characters. If you have that many characters your choices are either a lot of keys, a lot of modifier keys, or some kind of transliteration which is what is done now. There is just no way around this. You cannot have a language that is composed of a ton of glyphs but yet also have some extremely simple, small, entry system.

    You can have a simple system with few characters, like we do now, but you have to enter multiple ones to specify the glyph you want. You could have a direct entry system where one keypress is one glyph, but you'd need a massive amount of keys. You could have a system with a small number of keys and a ton of modifier keys, but then you have to remember what modifier, or modifier combination, gives what. There is no easy, small, direct system, there cannot be.

    Also, is it any more tedious than any Latin/Germanic language that only uses a small character set? While you may enter more characters than final glyphs, do you enter more characters than you would to express the same idea in French or English?

    1. Re:Would it be less tedious to have 10,000+ keys? by SnapShot · · Score: 2, Informative

      When it was first announced (5 years ago now?), I thought the Optimus Maximus keyboard was going to solve this problem. With a little smarts built into the keyboard I wouldn't mind esoteric key combinations if the result was displayed directly on the keyboard. Something like this might, someday, be the solution, but at $1,500 it's going to be a while, assuming a direct-brain interface doesn't come first.

      --
      Waltz, nymph, for quick jigs vex Bud.
    2. Re:Would it be less tedious to have 10,000+ keys? by MichaelSmith · · Score: 2, Interesting

      But few people really look at keyboards. Our fingers know where the button will be. I don't want to hunt and peck for special characters.

    3. Re:Would it be less tedious to have 10,000+ keys? by siride · · Score: 4, Funny

      Why bother? We already have machines that are good at that: two year olds. Two year olds aren't good at doing trend analysis on a million data points, which is why we have computers. We'd gain pretty much nothing from making a silicon-based two year old. It'd probably be just as slow and would cost considerably more than a two year old.

    4. Re:Would it be less tedious to have 10,000+ keys? by BrokenHalo · · Score: 1

      It would probably be quieter and less messy, though.

    5. Re:Would it be less tedious to have 10,000+ keys? by The+Mighty+Buzzard · · Score: 3, Insightful

      Hunt and peck? I don't even want to have to remember that many glyphs exist, much less where to find them. If it can't be expressed with a standard qwerty keyboard and one (shift) modifier key, it's too fucking complicated to bother with as general text entry.

      --
      Violence is like duct tape. If it doesn't solve the problem, you didn't use enough.
    6. Re:Would it be less tedious to have 10,000+ keys? by Anonymous Coward · · Score: 0

      "It'd probably be just as slow and would cost considerably more than a two year old."

      And not be nearly as much fun to create....

    7. Re:Would it be less tedious to have 10,000+ keys? by mrchaotica · · Score: 1

      You cannot have a language that is composed of a ton of glyphs but yet also have some extremely simple, small, entry system.

      A stylus and character recognition is pretty small and simple (for the user; not the person who developed the character recognizer). I wouldn't want to use one for programming, though.

      --

      "[Regarding the 'cloud,'] ownership was what made America different than Russia." -- Woz

    8. Re:Would it be less tedious to have 10,000+ keys? by Anonymous Coward · · Score: 0

      Obviously you only need 10 keys and two foot pedals to type 4096 different characters.
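
      The arithmetic does check out, assuming each of the 12 inputs (10 keys plus 2 pedals) is an independent on/off switch, which is a guess at what the poster meant:

```python
from itertools import product

KEYS = 10
PEDALS = 2

# Every chord is one on/off choice per input.
chords = list(product((0, 1), repeat=KEYS + PEDALS))

total = len(chords)                              # includes the "nothing pressed" chord
usable = sum(1 for chord in chords if any(chord))  # chords with at least one input down

print(total, usable)
```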

    9. Re:Would it be less tedious to have 10,000+ keys? by RajivSLK · · Score: 1

      "There is just no way around this. You cannot have a language that is composed of a ton of glyphs but yet also have some extremely simple, small, entry system."

      Really? How about something new crazy like a pen?

    10. Re:Would it be less tedious to have 10,000+ keys? by the_womble · · Score: 3, Funny

      You do not have kids do you? I can assure you that the cumulative cost of a two year old, starting from the first pre-natal medical costs, including lost work and productivity, food, drink, accommodation, etc. is considerable.

      Unlike computers, kids get more expensive every year, and there are laws about getting them to do useful work.

    11. Re:Would it be less tedious to have 10,000+ keys? by gknoy · · Score: 3, Informative

      It'll be interesting when you go to write some Perl code with your pen+tablet. The text recognition assumes you're writing in a natural language, so braces and punctuation are often tedious to get right. Write some basic Perl (with hashes, arrays, and some scalars) on your local handwriting-recognizing device, and let us know how amusing it is.

    12. Re:Would it be less tedious to have 10,000+ keys? by Anonymous Coward · · Score: 0

      There is no easy, small, direct system, there cannot be.

      Paintbrush? I'd like to see the million column paint brushed airline control system in Kanji. Way to claim programming as art!

    13. Re:Would it be less tedious to have 10,000+ keys? by h4rm0ny · · Score: 1

      When you have a need to enter accented characters, you also get in the habit of using the Alt Gr key pretty quickly. That also is acceptable. But something with more keys than the QWERTY keyboard is going to become bad, fast. Chorded keyboards shouldn't be much harder than a normal keyboard to master and would pay dividends, but that's nothing to do with the question of what glyphs to use in a programming language. I think the author of TFA has lost it. Sticking some obscure Unicode characters in the language isn't going to help any competent programmer and is just going to introduce some rare and bizarre bugs that will confuse the Hell out of you. Now being able to enter a language in full Unicode is / would be good, i.e. no tedious escaping the characters in string literals - just type them in, but that's not what the author is going on about.

      --

      Aide-toi, le Ciel t'aidera - Jeanne D'Arc.
    14. Re:Would it be less tedious to have 10,000+ keys? by pjt33 · · Score: 1

      It's easy to spot that you type in English. Those of us who type in other languages which use the Latin alphabet tend to need Alt-Gr and combining diacritics.

    15. Re:Would it be less tedious to have 10,000+ keys? by h4rm0ny · · Score: 1

      yes yes, but you have to replace two year olds a year after you get them.

      Have you tried eBay?

      --

      Aide-toi, le Ciel t'aidera - Jeanne D'Arc.
    16. Re:Would it be less tedious to have 10,000+ keys? by Anonymous Coward · · Score: 0

      would cost considerably more than a two year old.

      How much does a two year old cost? And where can they be bought?

    17. Re:Would it be less tedious to have 10,000+ keys? by KlaymenDK · · Score: 1

      You cannot have a language that is composed of a ton of glyphs but yet also have some extremely simple, small, entry system.

      Digitizer and stylus?

    18. Re:Would it be less tedious to have 10,000+ keys? by Drinking+Bleach · · Score: 1

      Indeed, his ranting on Java in particular is somewhat silly given this; it's been fully Unicode safe since the very beginning. If you can type the glyphs directly and don't want to bother with codepoints, then fine, you just do it, assuming you have an editor that is UTF-8 safe, but well, almost all are.

    19. Re:Would it be less tedious to have 10,000+ keys? by TheLink · · Score: 4, Insightful

      So how are you going to tell the difference between:
      a) a hyphen
      b) a dash
      c) a minus sign

      And worse the different unicode versions of hyphens and dashes:

      http://en.wikipedia.org/wiki/Hyphen#Unicode
      http://en.wikipedia.org/wiki/Dash#Common_dashes

      Yes, there's more than one unicode hyphen and dash! There are plenty of confusing characters like that too.
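
      To make the problem concrete, here is a small sketch in Python (chosen only for illustration) that lists five of the look-alikes by their official Unicode names and shows that a parser expecting the ASCII character rejects the "real" minus sign:

```python
import unicodedata

# Five characters that render as a short horizontal stroke in most fonts.
lookalikes = ["\u002d", "\u2010", "\u2013", "\u2014", "\u2212"]
names = [unicodedata.name(ch) for ch in lookalikes]


def parses_as_int(text):
    """Report whether Python's int() accepts the text."""
    try:
        int(text)
        return True
    except ValueError:
        return False


ascii_ok = parses_as_int("-5")         # U+002D HYPHEN-MINUS before the digit
unicode_ok = parses_as_int("\u22125")  # U+2212 MINUS SIGN before the digit

for ch, name in zip(lookalikes, names):
    print(f"U+{ord(ch):04X} {name}")
print(ascii_ok, unicode_ok)
```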

      So for programming you're still going to have to stick to a subset for keywords and symbols, and not use the full "tons of glyphs". Or at least you're going to need an entry system that allows you to switch.

      Maybe that Poul guy just wants a few extra symbols for some stuff. Good luck with that, many already complain about perl :).

    20. Re:Would it be less tedious to have 10,000+ keys? by Yvanhoe · · Score: 1

      As a French keyboard user and C programmer, I must say that I curse the fact that you have to use Alt-Gr for { [ @ # or |
      Sure, you get used to it, but it is still less comfortable than using a qwerty keyboard (which I sometimes do).

      However, I am too snobbish to write my French emails without accents so I guess I have to accept compromise. Most people here use "é" more often than "{".

      --
      The Wise adapts himself to the world. The Fool adapts the world to himself. Therefore, all progress depends on the Fool.
    21. Re:Would it be less tedious to have 10,000+ keys? by Svartalf · · Score: 1

      Good luck with that. The systems out there barely handle things like standard English handwriting. The Chinese systems more or less allow them to enter in the glyphs, but they're still more-or-less doing graphics and not actual symbol entry except for a small subset of the glyph set.

      --
      I am not merely a "consumer" or a "taxpayer". I am a Citizen of the State of Texas
    22. Re:Would it be less tedious to have 10,000+ keys? by SharpFang · · Score: 1

      Interestingly, Dasher, amongst its many natural languages dictionaries, has a few programming languages dictionaries in it, meaning if you want to write Perl, it will guess the right keywords and suggest braces, special characters, quotes etc. where they belong.

      --
      45 5F E1 04 22 CA 29 C4 93 3F 95 05 2B 79 2A B2
    23. Re:Would it be less tedious to have 10,000+ keys? by bytesex · · Score: 1

      I worked for a bank once, that would force _programmers_ to work with smart keyboarding - you had to type a space every time you did a ' or a : for example. I gave up that job very quickly. Not everybody is a business man in your organization and if you can't recognize that, you're not doing your managing very well.

      --
      Religion is what happens when nature strikes and groupthink goes wrong.
    24. Re:Would it be less tedious to have 10,000+ keys? by Razalhague · · Score: 1

      You know, it's very easy to switch between keyboard layouts. I always switch to a US layout when programming.

    25. Re:Would it be less tedious to have 10,000+ keys? by h4rm0ny · · Score: 1

      I'm in a similar position, but do it the other way around. I use an English (UK) keyboard and use the Alt-Gr to get my é. I find that quite natural as I think of it as e with an accent so my brain just goes 'right, press the accent key when I type'. Or as I'm using KDE 60% of the time, it goes 'right, press the accent key-combo before I type', but that's splitting hairs. In either case, I think it would be a lot less intuitive for me to have to fiddle around with Alt-Gr to get { and }. I suppose it depends whether you meant you are a keyboard user who is French, or a user of French keyboards, with the former being the significant element.

      --

      Aide-toi, le Ciel t'aidera - Jeanne D'Arc.
    26. Re:Would it be less tedious to have 10,000+ keys? by Hognoxious · · Score: 1

      French keyboards are also abysmal for editing/writing HTML or XML by hand, or using a Linux command line. < and > on the same key - WTF?

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    27. Re:Would it be less tedious to have 10,000+ keys? by geminidomino · · Score: 1

      Character recognition/digitizing is many things, but "simple" is not one of them.

    28. Re:Would it be less tedious to have 10,000+ keys? by WillAdams · · Score: 1

      Easy, use literate programming techniques ( http://www.literateprogramming.com/ ) and then typeset them using standard TeX semantics:

      a) a hyphen == ``-''
      b) a dash == em-dash ``---'', en-dash ``--''
      c) a minus sign == $-$ or \[-\] or \(-\) or \begin{equation}-\end{equation}

      More importantly, this allows one to use the full expressiveness of TeX to show the algorithms which are underlying the relevant code.

      William

      --
      Sphinx of black quartz, judge my vow.
    29. Re:Would it be less tedious to have 10,000+ keys? by Anonymous Coward · · Score: 0

      Also, is it any more tedious than any Latin/Germanic language that only uses a small character set? While you may enter more characters than final glyphs, do you enter more characters than you would to express the same idea in French or English?

      I don't know any Japanese, but I do know the basics of a few Latin/Germanic languages. French and English are possibly the least expressive, most tedious and long-winded languages in that group. Why not compare Japanese to more representative (and more effective/expressive) Latin/Germanic languages? What takes chapters of English/French text can usually be boiled down, by a good translator, to a few paragraphs in other Germanic languages, without loss of any information, neither hard data nor sentiment.

      E.g. why do you think mathematical and logical notation is mostly based on the language structures of German and Latin and not English or French? Why do you think the programming language Perl is a thinly veiled Latin? Why do you think Python is a thinly veiled German (or is it Dutch?)? Why do you think Simula, and all OO concepts, are based on Scandinavian language constructs? And why has no good computer language ever come from English? Cobol, AppleScript and Basic are improvements on the English language, but they are still a pain to program in.

    30. Re:Would it be less tedious to have 10,000+ keys? by Dishevel · · Score: 1

      A two year old child build it yourself kit can cost a serious fuckwad of money.

      --
      Why is it so hard to only have politicians for a few years, then have them go away?
    31. Re:Would it be less tedious to have 10,000+ keys? by sznupi · · Score: 1

      Certainly would be welcomed by some as a sign of an impending technological singularity.

      --
      One that hath name thou can not otter
    32. Re:Would it be less tedious to have 10,000+ keys? by Hognoxious · · Score: 1

      It's quite difficult when they've locked the machine down so tight that you can't. And yes, I have worked somewhere where they did that. They said it would be a security risk.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    33. Re:Would it be less tedious to have 10,000+ keys? by Hognoxious · · Score: 1

      I've seen military campaigns that were quieter and less messy than my little nephew.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    34. Re:Would it be less tedious to have 10,000+ keys? by slyrat · · Score: 1

      Because that's what you find in JIS X 0213:2000. Even if you simplify it to just what is needed for basic literacy, you are talking 2000 characters. If you have that many characters your choices are either a lot of keys, a lot of modifier keys, or some kind of transliteration which is what is done now. There is just no way around this. You cannot have a language that is composed of a ton of glyphs but yet also have some extremely simple, small, entry system.

      You can have a simple system with few characters, like we do now, but you have to enter multiple ones to specify the glyph you want. You could have a direct entry system where one keypress is one glyph, but you'd need a massive amount of keys. You could have a system with a small number of keys and a ton of modifier keys, but then you have to remember what modifier, or modifier combination, gives what. There is no easy, small, direct system, there cannot be.

      Also, is it any more tedious than any Latin/Germanic language that only uses a small character set? While you may enter more characters than final glyphs, do you enter more characters than you would to express the same idea in French or English?

      Well, with Japanese at least it isn't so bad. Essentially one can write and read Japanese using hiragana, so the characters are just typed in as they sound. When a combination of typed characters makes a hiragana character, it is usually put in; then, when the word is complete (space or end-of-sentence indicator), the input system preselects the most likely kanji to replace those characters. If it isn't the kanji you want, you can hit down or use the mouse to pick the right one. Usually it does a decent job of picking, since there aren't too many choices for a given word. In the cases of conjugation or foreign words you can have it just leave the hiragana or switch it to katakana. Hope that helps for those who haven't tried typing Japanese on a computer.
      Oh, it should also be noted that some Japanese keyboards have the hiragana characters on the keys rather than the alphabet, though it seems to mostly be as small characters in the corner of the keys.
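
      The flow described above (type the sounds, get hiragana, then pick a kanji candidate) can be sketched roughly as follows. This is only a toy model: both lookup tables and the candidate ranking are made-up samples, and a real input method uses far larger dictionaries and context.

```python
# Toy model of a kana-to-kanji input method. Both tables are tiny,
# hypothetical samples for illustration, not real IME data.
ROMAJI_TO_HIRAGANA = {"ha": "は", "shi": "し", "ka": "か", "mi": "み"}

# Candidates per reading, most likely first (the ranking is made up).
KANJI_CANDIDATES = {
    "はし": ["橋", "箸", "端"],  # bridge, chopsticks, edge
    "かみ": ["紙", "神", "髪"],  # paper, god, hair
}


def to_hiragana(romaji):
    """Convert typed Latin letters to hiragana by longest match."""
    out, i = [], 0
    while i < len(romaji):
        for size in (3, 2, 1):
            chunk = romaji[i:i + size]
            if chunk in ROMAJI_TO_HIRAGANA:
                out.append(ROMAJI_TO_HIRAGANA[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError(f"cannot convert {romaji[i:]!r}")
    return "".join(out)


def convert(romaji, choice=0):
    """Return the chosen kanji candidate, or plain hiragana if none is known."""
    kana = to_hiragana(romaji)
    candidates = KANJI_CANDIDATES.get(kana, [kana])
    return candidates[choice]


print(convert("hashi"))     # first (preselected) candidate
print(convert("hashi", 1))  # as if the user pressed "down" once
print(convert("mika"))      # no kanji entry, so it stays hiragana
```

      The `choice` argument stands in for hitting down or clicking in the candidate list.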

    35. Re:Would it be less tedious to have 10,000+ keys? by Hognoxious · · Score: 1

      It is too bad that we still need keys at all. Even the most powerful computer is still incredibly dumb compared to my 17-month-old grandson.

      I hear Stephen Hawking's not very good at football.

      --
      Confucius say, "Find worm in apple - bad. Find half a worm - worse."
    36. Re:Would it be less tedious to have 10,000+ keys? by Type-R · · Score: 1

      Really? How about something new crazy like a pen?

      Have you ever tried writing Kanji? Or used a Kanji dictionary to find the correct symbol? It makes English, and its "every rule has an exception" methods, seem simple :-)

    37. Re:Would it be less tedious to have 10,000+ keys? by Jurily · · Score: 1

      As a French keyboard user and C programmer, I must say that I curse the fact that you have to use Alt-Gr for { [ @ # or |

      How about switching layouts? I use Hungarian when writing Hungarian and US when coding or writing English. You'll also learn to not look at the keyboard really quickly.

    38. Re:Would it be less tedious to have 10,000+ keys? by Anonymous Coward · · Score: 0

      Anecdotally, I find that English tends to be capable of expressing the same thing more compactly than Asian languages with large character sets. In my work the Asian localized version of our software is always larger than the English one, and frequently we have to make UI tweaks to accommodate the fact that the Asian language text takes more screen real estate than English.

    39. Re:Would it be less tedious to have 10,000+ keys? by StikyPad · · Score: 1

      there are laws about getting them to do useful work.

      Only if it's someone else's child. Your own kids can do all the useful work you can threaten them into completing!

    40. Re:Would it be less tedious to have 10,000+ keys? by cayenne8 · · Score: 1
      "When you have a need to enter accented characters, you also get in the habit of using the Alt Gr key pretty quickly. "

      What is a Gr key? I don't believe I've ever seen that key before?? Are you using a special keyboard?

      --
      Light travels faster than sound. This is why some people appear bright until you hear them speak.........
    41. Re:Would it be less tedious to have 10,000+ keys? by Anonymous Coward · · Score: 0

      E.g. Why do you think mathematical and logical notation is mostly based on the language structures of German and Latin and not English or French. Why do you think the programming language Perl is a thinly veiled Latin. Why do you think Python is a thinly veiled German (or is it Dutch?). Why do you think Simula, and all OO concepts, is based on Scandinavian language constructs. And why has no good computer language ever came from English: Cobol, AppleScript and Basic is improvements of the English language, but they are still a pain to program in.

      Perl is thinly veiled Latin? Python is German? OO concepts are based on Scandinavian language??? What planet are you from?

      The only justifiable one you mention is AppleScript, which was deliberately designed to try to look like natural English in a misguided attempt to make it easier to learn for English speakers. That didn't work out so well, perhaps because making programming languages like human languages is not such a good idea.

    42. Re:Would it be less tedious to have 10,000+ keys? by electrosoccertux · · Score: 1

      Reminds me of a compile issue I had a few months ago.
      There was no documentation on the compiler. It's a dead binary. Yet we still use it for whatever reason.

      Anyways, I was getting compile errors that made no sense, which I could not seem to get rid of.
      I stared at my computer screen for about 2 hours before noticing
      that there was a difference between
      "
      and
      ``
      Now, that second one is actually two slanted something-or-the-others, but it's the closest I could get to a double-quotes with an inward slant.
      What had happened was I wrote some code in Notepad++ and then copied into Notepad, saved, and then tried to compile.

      Good times.
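
      A quick way to hunt down this kind of invisible-to-the-eye problem is to list every non-ASCII character in the file along with its position and Unicode name. The sketch below assumes the stray characters were curly quotes (U+201C and U+201D), which is only a guess about what ended up in the file; the sample source string is made up.

```python
import unicodedata


def find_non_ascii(source):
    """Return (line, column, char, name) for every non-ASCII character."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                name = unicodedata.name(ch, "UNKNOWN")
                hits.append((line_no, col, ch, name))
    return hits


# Hypothetical input: line 1 uses curly quotes, line 2 uses plain ASCII quotes.
sample = 'printf(\u201chello\u201d);\nprintf("hello");\n'

for line_no, col, ch, name in find_non_ascii(sample):
    print(f"line {line_no}, column {col}: U+{ord(ch):04X} {name}")
```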

    43. Re:Would it be less tedious to have 10,000+ keys? by h4rm0ny · · Score: 1

      Bottom right of a standard UK keyboard layout, adjacent to the space bar, is a key normally labelled "Alt Gr". Different OSes interpret it differently, but it's generally used in conjunction with another key to get characters that are related but different to what you would get without it. For example, on Windows 7 you can hold down Alt Gr and press the e key to get é (accented e). On default KDE, you can press Alt Gr + the ; key, and that primes the OS to give you an alternative version of whatever you type next, so pressing e after that combination gets you the same accented character as on Windows. The advantage of the KDE version is power: you can, for example, press Alt Gr + the ' key followed by the e key to get ê (e with circumflex). The combinations are fairly consistent and you quickly get used to them. Not sure what it's like on US keyboards. I'd thought it was similar, but if you're in the US then maybe I'm misremembering.
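
      The "prime, then type the base letter" behaviour can be modelled with Unicode combining marks and normalization. This is a sketch of the idea only, not how KDE or Windows implement it; the `DEAD_KEYS` table and the `compose` helper are hypothetical names.

```python
import unicodedata

# Hypothetical dead-key table: dead key name -> Unicode combining mark.
DEAD_KEYS = {
    "acute": "\u0301",       # COMBINING ACUTE ACCENT
    "circumflex": "\u0302",  # COMBINING CIRCUMFLEX ACCENT
}


def compose(dead_key, base):
    """Attach the dead key's accent to the base letter and normalize to NFC."""
    return unicodedata.normalize("NFC", base + DEAD_KEYS[dead_key])


print(compose("acute", "e"))       # e with acute accent
print(compose("circumflex", "e"))  # e with circumflex
```

      NFC normalization folds the base letter and the combining mark into a single precomposed character where one exists.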

      --

      Aide-toi, le Ciel t'aidera - Jeanne D'Arc.
    44. Re:Would it be less tedious to have 10,000+ keys? by Anonymous Coward · · Score: 0
    45. Re:Would it be less tedious to have 10,000+ keys? by Anonymous Coward · · Score: 0

      >You cannot have a language that is composed of a ton of glyphs but yet also have some extremely simple, small, entry system.

      Japanese does this quite well.

      In some ways these "glyph" languages are more suited for the digital era because most of the drawbacks of transcribing the language are reduced when using a computer (vs paper). Typically, given a certain length, you can represent far more information in Japanese than you can in English. This is partially because of the grammar and also partially because Chinese characters take up less space than English words -- meaning you need fewer keystrokes per word, fewer words per sentence, and less space per word.

      Even Chinese, which has a grammar just as redundant as English, tends to have shorter sentences because of the character set used. Shorter sentences mean you move your eyes less, which means you read faster.

    46. Re:Would it be less tedious to have 10,000+ keys? by ozphx · · Score: 1

      Odd, I find your mum only charges a couple of bucks.

      --
      3laws: No freebies, no backsies, GTFO.
    47. Re:Would it be less tedious to have 10,000+ keys? by Dishevel · · Score: 1

      I see that your mom was able to pop out a permanent 2 year old.

      --
      Why is it so hard to only have politicians for a few years, then have them go away?
  45. Ok, stop. by Kagetsuki · · Score: 1

    Unicode/UTF(8) compatibility as a base feature of the language - very good, I fight constantly with languages and code conversion because some dipshit didn't realize some people want to use multibyte strings. What's worse is people like Microsoft who assume they can just add crap to files to specify they contain multibyte strings (like their "BOM" for UTF8 - add that and you'll never read the file properly again in anything but Visual Studio).

    Unicode/UTF(8) compatibility within the language (function names, variable names) - questionable, but it would be nice sometimes. Some languages already do this (I think I've seen it in Ruby even?). You would make your code unreadable to someone who didn't read your language, but sometimes that could be a good thing, and hey, worst case scenario, run the code through a translator.

    Unicode/UTF(8) is required to enter the language - NO. WHY WOULD YOU DO THIS?

    1. Re:Ok, stop. by KarmaMB84 · · Score: 1

      The optional BOM in UTF-8 was the doing of Unicode. Even though UTF-8 only has one possible byte ordering, they allowed an optional BOM.
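
      For anyone who hasn't run into it, the UTF-8 BOM is just the three bytes EF BB BF at the start of the file. A small sketch of what a tool sees, using Python's standard codecs (the C source line is a made-up sample):

```python
import codecs

# A made-up one-line source file saved "UTF-8 with BOM".
data = codecs.BOM_UTF8 + "int main(void) { return 0; }\n".encode("utf-8")

print(data[:3])  # the three BOM bytes

# Decoding as plain UTF-8 keeps the BOM as a leading U+FEFF character,
# which is what trips up tools that expect the first token right away.
plain = data.decode("utf-8")
print(repr(plain[0]))

# The "utf-8-sig" codec drops the BOM if present and is harmless otherwise.
stripped = data.decode("utf-8-sig")
print(repr(stripped[:8]))
```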

    2. Re:Ok, stop. by Carewolf · · Score: 1

      If I remember correctly, the BOM was undefined for UTF-8, but it is a useful extension from its defined use in UTF-16.

    3. Re:Ok, stop. by Kagetsuki · · Score: 1

      GCC/G++/AS all choke on UTF8 with BOM as do many other things. "Standard" and "can actually be used" are different things and I swear to god if you send me a source file in UTF8 with BOM I'll find you and break your fingers.

  46. You cannot attach a royalty payment to ASCII... by GumphMaster · · Score: 1

    You cannot attach a royalty payment to ASCII so clearly, in this enlightened age when even implementing public APIs risks copyright litigation, we need to move away from this dangerously socialist encoding. We need a new encoding so that the relevant "owners" of the "intellectual property" embodied in the computer language can bill appropriately for your level of use of their "artistic endeavours". Each ideogram should contain encoded information on the rights owner so that corporate publisher birth-rights can be honoured in perpetuity on a per-instance basis. Of course, such encoding should be preserved through compilation to enable the collection of royalty payments for each end-use of the system also. After all, it's only fair.

    --
    Patent litigation: A doctrine of Mutually Assured Destruction... in which everyone seems willing to push the button
  47. Tried coding in Japanese by ook_boo · · Score: 1

    About 15 years ago I worked in a Japanese office where the database had its own scripting language. The company that created the database had translated all the keywords into Japanese and made it so that it would display correctly, so IF --> (its Japanese equivalent), etc. Further, you could flip back and forth between English and Japanese versions easily and not have problems with the compiler. But not one of the Japanese programmers used the Japanese version. They thought it was just weird, and they'd already learned how to use IF in English anyway. I suspect using non-ASCII symbols is a solution without a problem.

  48. RE: Poul-Henning Kamp ,,, by Anonymous Coward · · Score: 0

    In the words of Ozzy ... "This loser should just fuck off and stop this insane shit man.
    What an idiot this Kamp guy. He'd be better off and a lot happier as a gay transvestite.
    Come on, What a pervert the guy.
    So 'e wants the Bank of Britain to scrap the old and reliable PDPs. Fat chance pervert
    boy Poul-Henning Kamp. Fuck off man! Why not get a job giving male massage at
    the local parlor man. Oh, excuse me... Kamp thingy got no balls. There's the story."

  49. UTF-8 from Thompson & Pike by OrangeTide · · Score: 1

    Thompson invented UTF-8 and Pike and others implemented UTF-8 in Plan 9. I think your Unicode fetish owes a debt to Pike and his colleagues!

    --
    “Common sense is not so common.” — Voltaire
    1. Re:UTF-8 from Thompson & Pike by rubycodez · · Score: 1

      UTF-8 for *using* Plan 9, but I was just over at Bell's Plan 9 website browsing the source code and it's C written in ASCII.

      So let's face it, UTF-8 is what you can use on your system which was programmed in ASCII. ASCII, the standard of computing, written in granite and never to be changed as long as we have binary digital computers.

    2. Re:UTF-8 from Thompson & Pike by Anonymous Coward · · Score: 0

      I think my comment deserved a better response than this. I can't even make sense of it.

  50. Re:Microsoft Visual Studio allows Unicode identifi by OrangeTide · · Score: 1

    now we need to discuss if this is a good thing or a bad thing. I'm going to cast my vote for Bad Thing.

    --
    “Common sense is not so common.” — Voltaire
  51. One place it is needed. by Anonymous Coward · · Score: 0

    The only place I really want Unicode is directly in the strings I type. When I type a wstring in C/C++ it'd be nice to type or copy-paste the Kanji directly in between my quotation marks rather than the Unicode codes. The actual language keywords and standard library functions can be in any language as long as my keyboard can type them.
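
    The difference between the two ways of writing such a literal is easy to show. The sketch below uses Python rather than C/C++, simply because it accepts both forms in a UTF-8 source file; the word used (kanji for "kanji") is just a sample.

```python
# The same two-character string written two ways.
escaped = "\u6f22\u5b57"  # code points spelled out by hand
direct = "漢字"            # typed or pasted straight into the source

print(escaped == direct)
print([f"U+{ord(ch):04X}" for ch in direct])
```

    Both spellings produce the same string; only the second can be read without a code chart.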

  52. Unicode programming? by Anonymous Coward · · Score: 0

    I'll start coding with Unicode when Americans can spell COLOUR correctly.

  53. The sun never sets on the British Empire by Anonymous Coward · · Score: 0

    Well maybe it did for a couple decades after WW II, but ASCII brought it right back again.

    Rule, Britannia! ASCII rules the waves...

  54. it's not ASCII to blame by lkcl · · Score: 4, Insightful

    the point has been entirely missed, and blame placed on ASCII [correlation is not causation]. when you look at the early languages - FORTH, LISP, APL, and later even Awk and Perl, you have to remember that these languages were living in an era of vastly less memory. FORTH interpreters fit into 1k with room to spare for goodness sake! these languages tried desperately to save as much space and resources as possible, at the expense of readability.

    it's therefore easy to place blame onto ASCII itself.

    then you have compiled languages like c, c++, and interpreted ones like Python. these languages happily support unicode - but you look at free software applications written in those languages and they're still by and large kept to under 80 chars in length per line - why is that? it's because the simplest tools are not those moronic IDEs; the simplest programming tools for editing are straightforward ASCII text editors: vi and (god help us) emacs. so by declaring that "Thou Shalt Use A Unicode Editor For This Language" you've just shot the chances of success of any such language stone dead: no self-respecting systems programmer is going to touch it.

    not only that, but you also have the issue of international communication and collaboration. if the editor allows Kanji, Cyrillic, Chinese and Greek, contributors are quite likely to type comments in Kanji, Cyrillic, Chinese and Greek. the end-result is that every single damn programmer who wants to contribute must not only install Kanji, Cyrillic, Chinese and Greek unicode fonts, but also they must be able to read and understand Kanji, Cyrillic, Chinese and Greek. again: you've just destroyed the possibility of collaboration by terminating communication and understanding.

    then, also, you have the issue of revision control, diffs and patches. by moving to unicode, git, svn, bazaar, mercurial and cvs all have to be updated to understand how to treat unicode files - which they can't (they'll treat it as binary) - in order to identify lines that are added or removed, rather than store the entire file on each revision. bear in mind that you've just doubled (or quadrupled, for UCS-4) the amount of space required to store the revisions in the revision control systems' back-end database, and bear in mind that git repositories such as linux2.6 are 650mb if you're lucky (and webkit 1gb) you have enough of a problem with space for big repositories as it is!

    but before that, you have to update the unix diff command and the unix patch command to do likewise. then, you also have to update git-format-patch and the git-am commands to be able to create and mail patches in unicode format (not straight SMTP ASCII). then you also have to stop using standard xterm and standard console for development, and move to a Unicode-capable terminal, but you also have to update the unix commands "more" and "less" to be able to display unicode diffs.

    there are good reasons why ASCII - the lowest common denominator - is used in programming languages: the development tools revolve around ASCII, the editors revolve around ASCII, the internationally-recognised language of choice (english) fits into ASCII. and, as said right at the beginning, the only reason why stupid obtuse symbols instead of straightforward words were picked was to cram as much into as little memory as possible. well, to some extent, as you can see with the development tools nightmare described above, it's still necessary to save space, making UNICODE a pretty stupid choice.

    lastly it's worth mentioning python's easy readability and its bang-per-buck ratio. by designing the language properly, you can still get vast amounts of work done in a very compact space. unlike, for example java, which doesn't even have multiple inheritance for god's sake, and the usual development paradigm is through an IDE not a text editor. more space is wasted through fundamental limitations in the language and the "de-facto" GUI development environment than through any "blame" attached to ASCII.

    1. Re:it's not ASCII to blame by santax · · Score: 1

      Very good point. However, even God won't be able to help you once you go into the depths called VI or EMACS. Seriously, I have gotten Emacs to make me coffee and raise my kids, but I still don't know how to save a damn textfile. Don't get me started on VI either. That would truly become a holy war. *starts up edit.exe, white on blue, 80 chars width. It's all I need to tell someone to f* off.

    2. Re:it's not ASCII to blame by BufferArea · · Score: 1

      if the editor allows Kanji, Cyrillic, Chinese and Greek, contributors are quite likely to type comments in Kanji, Cyrillic, Chinese and Greek. the end-result is that every single damn programmer who wants to contribute must not only install Kanji, Cyrillic, Chinese and Greek unicode fonts, but also they must be able to read and understand Kanji, Cyrillic, Chinese and Greek. again: you've just destroyed the possibility of collaboration by terminating communication and understanding.

      This is a project management issue. Many managers might think code is code and programmers are interchangeable, but it is important that programmers can communicate (and thus need to speak a common language). Besides, source code repositories could be adapted for this - just specify what subset of unicode is allowed, and disallow check-ins of files that contain characters outside of this subset.
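
      A check-in filter of that kind could look something like the sketch below. The allowed ranges are a purely hypothetical project policy (ASCII, Latin-1 Supplement and Greek), and the function names are made up; wiring it into an actual repository hook is left out.

```python
# Hypothetical project policy: which Unicode ranges may appear in source files.
ALLOWED_RANGES = [
    (0x0000, 0x007F),  # Basic Latin (ASCII)
    (0x0080, 0x00FF),  # Latin-1 Supplement
    (0x0370, 0x03FF),  # Greek and Coptic
]


def disallowed_chars(text, allowed=ALLOWED_RANGES):
    """Return the sorted set of characters that fall outside every allowed range."""
    return sorted(
        {ch for ch in text
         if not any(lo <= ord(ch) <= hi for lo, hi in allowed)}
    )


def may_check_in(text):
    """True if the file content uses only the permitted subset."""
    return not disallowed_chars(text)


print(may_check_in("area = π * r ** 2  # café"))  # Greek and Latin-1 only
print(disallowed_chars("x = 1  # комментарий"))    # Cyrillic is outside the policy
```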

      then, also, you have the issue of revision control, diffs and patches. by moving to unicode, git, svn, bazaar, mercurial and cvs all have to be updated to understand how to treat unicode files - which they can't (they'll treat it as binary) - in order to identify lines that are added or removed, rather than store the entire file on each revision. bear in mind that you've just doubled (or quadrupled, for UCS-4) the amount of space required to store the revisions in the revision control systems' back-end database, and bear in mind that git repositories such as linux2.6 are 650mb if you're lucky (and webkit 1gb) you have enough of a problem with space for big repositories as it is!

      Seriously? In this day and age, the amount of space required for source code should never be an issue. Storage space is cheap. If people are serious about a project, getting adequate space for storing the code repository should never be an issue.

      but before that, you have to update the unix diff command and the unix patch command to do likewise. then, you also have to update git-format-patch and the git-am commands to be able to create and mail patches in unicode format (not straight SMTP ASCII). then you also have to stop using standard xterm and standard console for development, and move to a Unicode-capable terminal, but you also have to update the unix commands "more" and "less" to be able to display unicode diffs.

      Are there technical reasons why this would not be feasible?

      there are good reasons why ASCII - the lowest common denominator - is used in programming languages: the development tools revolve around ASCII, the editors revolve around ASCII, the internationally-recognised language of choice (english) fits into ASCII. and, as said right at the beginning, the only reason why stupid obtuse symbols instead of straightforward words were picked was to cram as much into as little memory as possible. well, to some extent, as you can see with the development tools nightmare described above, it's still necessary to save space, making UNICODE a pretty stupid choice.

      Those were good reasons in the past. Why can't we move past these reasons now, though?

    3. Re:it's not ASCII to blame by MadMaverick9 · · Score: 1

      the simplest programming tools for editing are straightforward ASCII text editors: vi and (god help us) emacs. so by declaring that "Thou Shalt Use A Unicode Editor For This Language" you've just shot the chances of success of any such language stone dead: no self-respecting systems programmer is going to touch it.

      Excuse me - vim can handle utf-8 just fine. utf-8 file names and utf-8 content. on a vanilla slackware 13.1.
      http://www.cl.cam.ac.uk/~mgk25/unicode.html#apps
      # Vim (the popular clone of the classic vi editor) supports UTF-8 with wide characters and up to two combining characters starting from version 6.0.
      # Emacs has quite good basic UTF-8 support starting from version 21.3. Emacs 23 changed the internal encoding to UTF-8.
      And svn can handle utf-8 as well - http://svnbook.red-bean.com/en/1.4/svn.advanced.l10n.html.

      The repository stores all paths, filenames, and log messages in Unicode, encoded as UTF-8.

      All it requires is ... set your locale and lang. "export LANG=en_DK.utf8" in "/etc/profile.d/lang.sh" (Slackware 13.1).

    4. Re:it's not ASCII to blame by Aryarnak · · Score: 1

      Excuse me - vim CANNOT handle Unicode fine. Just open any file containing a Unicode complex script and you will see how well vim handles Unicode. Does the following text look the same in your browser and in vim? Check it: http://pastebin.com/LdCFTpq1

    5. Re:it's not ASCII to blame by MadMaverick9 · · Score: 1

      unicode - no. utf-8 - yes.

      I just copied and pasted the text from your link into vim and it seems to look the same. I cannot read Hindi, so I cannot check for correctness.
      But I can type Thai into my vim just fine, and Thai is a complex script too.

      Two character codes in the Hindi text give problems in vim: <200d> and <200c>. I don't know if that's a problem with my fonts or what.

      But typing Thai into my vim works fine. I wish I could post an example here, but /. is an "iso-8859-1" site only.

      You could use "iconv" to convert your text from unicode into utf-8 and then open it in vim.

    6. Re:it's not ASCII to blame by MadMaverick9 · · Score: 1

      Two character codes in the Hindi text give problems in vim: <200d> and <200c>. I don't know if that's a problem with my fonts or what.
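
      For reference, those two code points are invisible formatting characters that control how neighbouring letters join, which is why an editor without complex-script shaping shows them as raw codes. A short lookup with Python's `unicodedata` module:

```python
import unicodedata

for cp in (0x200C, 0x200D):
    ch = chr(cp)
    print(
        f"U+{cp:04X}",
        unicodedata.name(ch),
        "category", unicodedata.category(ch),  # Cf = format character
        "utf-8", ch.encode("utf-8").hex(),
    )
```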

    7. Re:it's not ASCII to blame by Anonymous Coward · · Score: 0

      Every character of an Indian script gives problems in vim. It's not a problem with your font. It's vim's issue.

    8. Re:it's not ASCII to blame by Aryarnak · · Score: 1

      unicode - no. utf-8 - yes.

      I just copied and pasted the text from your link into vim and it seems to look the same.

      Can you please show me a screenshot of the working Hindi text in vim? I bet it's not the same as in the browser.

    9. Re:it's not ASCII to blame by Anonymous Coward · · Score: 0

      the end-result is that every single damn programmer who wants to contribute must not only install Kanji, Cyrillic, Chinese and Greek unicode fonts

      You just need a single Unicode font.

    10. Re:it's not ASCII to blame by Anonymous Coward · · Score: 0

      then you have compiled languages like c, c++, and interpreted ones like Python. these languages happily support unicode - but you look at free software applications written in those languages and they're still by and large kept to under 80 chars in length per line - why is that? it's because the simplest tools are not those moronic IDEs; the simplest programming tools for editing are straightforward ASCII text editors: vi and (god help us) emacs. so by declaring that "Thou Shalt Use A Unicode Editor For This Language" you've just shot the chances of success of any such language stone dead: no self-respecting systems programmer is going to touch it.

      Emacs (and I believe vi too) does support Unicode and has no problems with lines longer than 80 characters.

      then, also, you have the issue of revision control, diffs and patches. by moving to unicode, git, svn, bazaar, mercurial and cvs all have to be updated to understand how to treat unicode files - which they can't (they'll treat it as binary) - in order to identify lines that are added or removed, rather than store the entire file on each revision.

      git and svn treat every file as "binary" with full "delta storage" capability.

      bear in mind that you've just doubled (or quadrupled, for UCS-4) the amount of space required to store the revisions in the revision control systems' back-end database, and bear in mind that git repositories such as linux2.6 are 650mb if you're lucky (and webkit 1gb) you have enough of a problem with space for big repositories as it is!

      It is called UTF-8 and uses only as much space as necessary, that is: English with little information per character takes up only one byte and for example Chinese with very much information per character takes up more bytes per character.
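
      The size difference is easy to measure. The sketch below compares byte counts for a few sample strings in UTF-8, UTF-16 and UTF-32 (the little-endian variants are used so that no BOM is added to the count); `encoded_sizes` is just a helper name made up for this example.

```python
def encoded_sizes(text):
    """Byte length of the text in three Unicode encodings (no BOM)."""
    return {enc: len(text.encode(enc))
            for enc in ("utf-8", "utf-16-le", "utf-32-le")}


for text in ["A", "é", "中", "\U0001F600", "int main(void)"]:
    print(repr(text), encoded_sizes(text))
```

      Pure ASCII source costs exactly the same in UTF-8 as it did before; only the non-ASCII characters take extra bytes.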

      but before that, you have to update the unix diff command and the unix patch command to do likewise.

      Those programs already support UTF-8. In fact SVN will happily give you a diff of two PNGs if not marked as "image/png".
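
      The reason line-based tools cope is that UTF-8 never uses the newline byte inside a multi-byte character, so splitting raw bytes into lines still works. The sketch below illustrates this with Python's `difflib` on raw bytes; it is not how diff(1), git or svn are implemented, and the two file versions are made-up samples.

```python
import difflib

# Two made-up versions of a file with Greek comments, as raw UTF-8 bytes.
old = "x = 1  # ένα\ny = 2\n".encode("utf-8")
new = "x = 1  # ένα\ny = 3  # τρία\n".encode("utf-8")

# Split on line endings without decoding anything.
old_lines = old.splitlines(keepends=True)
new_lines = new.splitlines(keepends=True)

matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
changes = [
    (tag, old_lines[i1:i2], new_lines[j1:j2])
    for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    if tag != "equal"
]

for tag, before, after in changes:
    print(tag, before, after)
```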

      but you also have to update the unix commands "more" and "less" to be able to display unicode diffs.

      They already support UTF-8. (Who uses "more"?!)

    11. Re:it's not ASCII to blame by JesseMcDonald · · Score: 1

      the simplest programming tools for editing are straightfoward ASCII text editors: vi and ... emacs

      Modern versions of both VIM and EMACS support Unicode text files, provided you use either the GUI interface or a UTF-8 terminal.

      the end-result is that every ... programmer who wants to contribute must not only install Kanji, Cyrillic, Chinese and Greek unicode fonts, but also they must be able to read and understand Kanji, Cyrillic, Chinese and Greek

      Short of forcing everyone to comment their code in English, you're going to have that problem anyway. Fixing it is a matter of enforcing suitable commenting standards; the mere fact that you can comment your code naturally in many languages doesn't mean you have to use them all in the same project.

      then, also, you have the issue of revision control, diffs and patches

      Granted, but this is just a matter of improving the tools slightly. Compared to implementing a whole new language, supporting Unicode in a handful of diff/patch tools and revision management systems would be trivial—not to mention well worth doing in its own right—particularly given the existence of stable Unicode-handling libraries like ICU.

      you also have to stop using standard xterm and standard console for development, and move to a Unicode-capable terminal, but you also have to update the unix commands "more" and "less" to be able to display unicode diffs

      The standard xterm already supports UTF-8. Just put "XTerm.vt100.utf8: 2" in your .Xdefaults, or pass the "-u8" option on the command-line. I don't use "more" very much, but I've never had trouble displaying UTF-8 text with "less" in a UTF-8 locale (LANG="en_US.UTF-8").

      --
      "The state is that great fiction by which everyone tries to live at the expense of everyone else." - Bastiat
  55. Language designers want languages to be used. by bcrowell · · Score: 1

    The big problem here is that language designers want their languages to get used.

    There is a difference between telling your users that they *can* use unicode and telling them that they *have to*. Every language I can think of that said you *had to* use non-ASCII characters is dead: APL, Algol. I don't know about the detailed reasons why nobody actually codes in Algol (maybe just because it was mainly meant as a language for describing algorithms, not for writing practical programs), but APL's absurdly inconvenient character set was surely a reason that it expanded to fill a tiny niche and then quickly died even in that niche.

    *Allowing* programmers to use non-ASCII characters is a lot more reasonable, but this is not exactly the world's biggest innovation. Perl allows you to use unicode characters inside string literals, but it also allows you to use, e.g., Chinese characters as names of variables. Is this a good thing? I guess so, in the sense that choice is good. But what happens when someone who doesn't speak Chinese wants to maintain code that uses Chinese variable names? Sure, we shouldn't be cultural chauvinists, but realistically, every literate Chinese person can recognize the letters of the Latin alphabet, whereas the converse isn't true -- coders in New York or Mumbai can't read Chinese characters.

    There is also a nontrivial issue of look-alike characters, which could be a source of errors. For example, do I really want to be able to have one variable named Y (upper-case Latin Y) and another named Y (upper-case Greek upsilon)?
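
    Python 3 happens to allow exactly this, so the problem can be demonstrated directly. In the sketch below the Greek letter is written as an escape so the two are distinguishable on the page; `scripts_used` is a made-up helper that leans on the first word of each character's Unicode name, which is a rough heuristic rather than a proper script lookup.

```python
import unicodedata

latin_y = "Y"
greek_upsilon = "\u03a5"  # looks the same in most fonts

print(latin_y == greek_upsilon)
print(unicodedata.name(latin_y))
print(unicodedata.name(greek_upsilon))

# Both are valid identifiers, and they name two different variables.
namespace = {}
exec(f"{latin_y} = 1\n{greek_upsilon} = 2", namespace)
print(namespace[latin_y], namespace[greek_upsilon])


def scripts_used(identifier):
    """Rough heuristic: first word of each letter's Unicode name."""
    return {unicodedata.name(ch).split()[0]
            for ch in identifier if ch.isalpha()}


print(scripts_used(latin_y + greek_upsilon))
```

    A linter could flag any identifier for which `scripts_used` returns more than one script.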

    1. Re:Language designers want languages to be used. by multipartmixed · · Score: 1

      > There is also a nontrivial issue of look-alike characters, which could be a source of errors.

      You said it, brother. With my aging eyes and problematic hand nerves, I frequently mis-type ; as : at the end of the line. Even though it's a syntactic element and emacs helps me, I still spend 10-12 minutes a week searching for and fixing these.

      If it was a less important symbol, the bugs would be even harder to find.

      --

      Do daemons dream of electric sleep()?
  56. I have a theory. by Kaenneth · · Score: 1

    Any programming language expands until every available set of brace characters is valid in every context.

    () {} [] <>

    take C#... say you have an identifier 'x', x() is for method calls, x[] is for indexing, {} is reserved for code blocks, and x<> is for generics.

    I think unicode would be nice for non-native English speakers to use identifiers in their native language, but would lead to an explosion of operators and braces, neither of which would help readability of code.

    You could define a language with compound braces, just as C-derived languages have += == !=, etc. You could define combo braces; f vs f vs f could each represent different things.

    But it'll all boil down to invoking a method with some parameters. It's all syntactic sugar, just as x[n] to access an indexed item could be x.Lookup(n).
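
    The x[n] versus x.Lookup(n) point can be sketched in Python, where indexing syntax is defined as a call to a specially named method. The Table class and its lookup method are hypothetical names made up for this illustration.

```python
class Table:
    """Tiny container used only to show that x[n] is sugar for a method call."""

    def __init__(self, items):
        self._items = list(items)

    def lookup(self, n):
        # The "real" operation: fetch the n-th stored item.
        return self._items[n]

    def __getitem__(self, n):
        # Python turns the expression x[n] into a call to this method.
        return self.lookup(n)


x = Table(["a", "b", "c"])

# All three spellings invoke the same code path.
print(x[1], x.lookup(1), x.__getitem__(1))
```

    Adding another bracket pair to a language would give a fourth spelling of the same call, which is the sense in which it is only sugar.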

    XML is interesting, as something written with XML basically has an unlimited set of braces, "" allowing virtually infinite ways to expand the definition of objects. However, XML would make a very painful base for a programming language.

  57. Flamebait or Troll? by amirulbahr · · Score: 1

    The article borders on the ridiculous. Colour coding blocks of code to mark them private? Yeah, that is much more readable than, say using a sequence of pre-historic ASCII characters like 'private'.

    Nothing wrong with some food for thought and the article certainly gives some. I believe languages can be more verbose as typing is no longer a slow process over a TTY, and source code size is no longer an issue. This does not require new characters, just more actual words.

  58. Re:PROGRAMMERS ARE CONSERVATIVE? by Tridus · · Score: 1

    Laziness is good. Why would I waste time and effort on something that doesn't matter in the slightest, when I could instead do something useful? Or for that matter, do nothing?

    --
    -- "So they told me that using the download page to download something was not something they anticipated." - Bill Gates
  59. Down with programming! by woboyle · · Score: 1

    The thing is, we need to get rid of programming applications altogether. With proper adaptive systems, one should be able to tell the computer what to do, and not worry about the details of how to do it. That is work I started on when at Brooks Automation about 10 years ago (in the division now part of Applied Materials). At an internal developers' conference I once said that my job was to make my job obsolete by the time I retired. Unfortunately, I got RIF'd five years after that, about 10 years before I would be ready to retire... :-)

    --
    Sometimes, real fast is almost as good as real-time.
  60. Need the right keyboard by Todd+Knarr · · Score: 1

    Give me a keyboard with the symbols in question on it directly and I'll agree with him. But if I've got to remember arcane multi-key combinations for symbols not printed on keycaps or immediately obvious from what's printed (eg. dead keys for accents and such), or if I've got to remember 3-digit codes for characters, then it's a no-go and I'll stick to what's on the keyboard.

  61. Re:The thing with ASCII [COBOL 2.0?] by Tablizer · · Score: 5, Interesting

    This proposal isn't about giving programmers more power to code, it's about making it easier for non-english speakers who aren't coders to read the code that their programmers write.

    COBOL was originally designed so that managers and customers could read it. But in practice they rarely did, because programming logic is typically too low-level and requires too much technical context for a non-programmer or non-team member to understand anyhow. Being "English-like" or grammatically proper didn't really help that goal in practice. This is why the idea was abandoned in later languages.

    Perhaps it's comparable to legalese. Making it proper English doesn't necessarily improve readability by non-lawyers. It's still gibberish to most of us without a legal background.

    It's not worthwhile to slow down production programmers in a trade for the rare case where non-programmers will want to read code for an actual need (not just curiosity). Thus, it's an uneconomical requirement as long as there is such a trade-off.

  62. Some New Operator Symbols Would be Handy by Anonymous Coward · · Score: 0

    It would be nice to have new symbols for some programming functions.
    For example, there are assembly language mnemonics for things like an 8/16/32-bit rotate left, which moves the top bit to the bottom.
    However, they are difficult to express in higher level languages, and the compiler might not code it efficiently depending on the compiler and underlying CPU.

    When I use this instruction to create a shift register, I can code it more easily in assembler than in C.
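
    For readers who have not met the instruction, here is a minimal Python sketch of the shift-and-or idiom that a rotate-left usually turns into when a language has no operator for it. The function name rotl and the width parameter are my own choices for illustration.

```python
def rotl(value, shift, width=32):
    """Rotate `value` left by `shift` bits within a `width`-bit word.

    Bits pushed off the top re-enter at the bottom, unlike a plain shift.
    """
    mask = (1 << width) - 1   # e.g. 0xFFFFFFFF for width=32
    shift %= width            # rotating by the full width is a no-op
    value &= mask             # ignore any bits above the word size
    return ((value << shift) | (value >> (width - shift))) & mask


# The top bit of 0x80000001 wraps around to bit 0.
print(hex(rotl(0x80000001, 1)))        # 0x3
print(hex(rotl(0x81, 1, width=8)))     # 0x3
print(hex(rotl(0x12345678, 8)))        # 0x34567812
```

    A dedicated rotate symbol would collapse the masking and the two shifts into one token, which is the convenience the post is asking for.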

  63. Re:PROGRAMMERS ARE CONSERVATIVE? by WheelDweller · · Score: 0

    Sure; not knocking laziness!

    I'm just saying with one HARD thing to do, which needs to be mastered before you count on it, versus an easy thing you do all the time...that's not being conservative.

    --
    --- For a good time mail uce@ftc.gov
  64. Grep on ascii rules by goombah99 · · Score: 2, Insightful

    Grep on ascii is more than 100x faster for complex string expressions. There's a lot of good reasons not to use unicode.

    --
    Some drink at the fountain of knowledge. Others just gargle.
  65. Rich editing environments by brantondaveperson · · Score: 1

    Well I don't think anyone here has much of an issue with writing their source code in ASCII - as it's been pointed out ASCII is simple, well understood, sufficient for our current languages and extremely portable.

    But what about comments? What I'd like to get my hands on is an editor that:
    1) Understands utf-8 source code (so we can get nice characters in comments)
    2) Allows diagrams to be embedded in source code as comments. ASCII may be fine for code, but it sure sucks for diagrams.

    Does such a thing exist?

    1. Re:Rich editing environments by tftp · · Score: 1

      1) Understands utf-8 source code (so we can get nice characters in comments)

      MSVC already allows you to do that.

      2) Allows diagrams to be embedded in source code as comments. ASCII may be fine for code, but it sure sucks for diagrams.

      As a business owner, how much are you willing to pay your coders to draw these diagrams in the first place and then maintain them as the code changes? In my experience even plain text comments quickly get out of sync with the code. A complicated drawing will be cast aside at the very first death march - and you will need those marches, given that your coders spent time on drawing fancy pictures instead of coding.

      There is another popular belief that says if you need lengthy comments about some piece of code then probably this piece of code needs to be rewritten.

      I don't want to sound like I'm against diagrams and textual descriptions, but often they are better done in a separate document.

    2. Re:Rich editing environments by brantondaveperson · · Score: 1

      I don't want to sound like I'm against diagrams and textual descriptions, but often they are better done in a separate document.

      Well I don't see why they're any more likely to be kept in sync just because they're in a separate document - in fact it seems to me that they'd be more likely to diverge when they're in another document.

      I have a feeling that it would be valuable to me at least, and I was wondering if a tool (perhaps an MSVC plugin even) existed for the purpose.

      The kind of diagram I'm talking about is (for instance) a geometric diagram that illustrates the reason that the particular bit of maths is being done in this particular way. A complex diagram in this instance in no way indicates that the code needs to be rewritten, and while you're right that it could live comfortably in another document, it would be nice to see it there right beside the code.

    3. Re:Rich editing environments by tftp · · Score: 1

      Well I don't see why they're any more likely to be kept in sync just because they're in a separate document

      That's exactly because they don't have to be synchronized each time you make a small change. You can have the document written before the code, then you update it once or twice during the development, and then you do the final update when you are done and the code is released. This allows you to plan the work on documentation, as opposed to cramming it into an emergency fix when your boss is standing behind you, with airplane tickets in hand. (I had that happen to me more than once.)

      The kind of diagram I'm talking about is (for instance) a geometric diagram that illustrates the reason that the particular bit of maths is being done in this particular way.

      Per UNIX philosophy, combine existing tools instead of making a new one. Which means, draw your diagram in Visio, embed it into a MS Word document and be happy.

      Besides, it is always better to have a document that, though still confidential, can be given to a 3rd party for review or for integration purposes without sending them the source.

  66. So why TEXT at all? by blcss · · Score: 1

    Source code is chock full of inherent structure. Why confine ourselves to flat text that has to be parsed? If we're going to invent yet another new programming language that forces us to throw out all our old code, then we may as well go for broke. Make it some binary format that encapsulates all the structure, and work with it using an IDE that understands the format and represents it visually. We don't even need to all agree on the visualization.

    --
    We don't need yet another new programming language. Let's just pick an existing language and fix its flaws.
  67. Why binary? by Arancaytar · · Score: 1

    Restricting digital storage to ones and zeros is needlessly polarizing and limiting. Why not allow a 0.5 bit value?

    1. Re:Why binary? by Anonymous Coward · · Score: 0

      true/false/maybe ?

    2. Re:Why binary? by nu1x · · Score: 1

      That's what quantum computing is for.

      --
      I have nothing to lose but my bindings.
  68. How about more than text? by BufferArea · · Score: 1

    Why should code be tied to text only anyway? I know there have been some experiments that never really took off, but even if we could expand programs to more than simple text just for comments that would be a huge help. A diagram or picture can often more accurately, and quickly, convey how a piece of code should work than a long piece of text. It would also be nice if we could reference non-code files from a code file. How about linking a class or method to a specification document (or part of it)? It would also be nice if you were alerted to check correctness of the linking code if the relevant section of the specification document changed.

    We currently write source code as the compiler is the only consumer of the file that matters and that humans are some inconvenient aspect that we begrudgingly make the code accessible to. Thinking of people as first class consumers of source code may have a significant impact on programming.

  69. This isn't the problem by Anonymous Coward · · Score: 0

    The problem is I/O that (still) isn't 8-bit clean and the setlocale, wchar bullshit in the C std library (compare with Plan-9).

    We can talk about using non-ascii glyphs for syntax when we can easily and reliably display UTF-8 everywhere.

  70. Non-issue by bonch · · Score: 1

    This seems like one of the least important issues about today's programming languages. Is anyone having problems because their source code uses ASCII? The guy even suggests making color a part of the language syntax, such as marking protected regions with gray frames. The problems with these ideas (which are not original) are almost an entire article themselves. An amusing Sunday night article, but no thanks.

  71. Go Cry at the Romans by swdunlop · · Score: 1

    If a character set from the 60's is the only legacy standard we carry forwards in programming, we're doing pretty good. Look at how axle length of Roman chariots has dominated transportation systems -- http://www.associatedcontent.com/article/390903/how_the_romans_influenced_the_space.html

    1. Re:Go Cry at the Romans by Zobeid · · Score: 2, Informative

      I've read that story before, and it's very neat. It's just too bad there's so little truth to it. Here's an example where it really falls apart: "As the railroads were built they were built using the same standard width of all the wagons since the tools had been standardized to that width." Anybody with casual knowledge of railway history should remember the crazy profusion of different -- widely varying -- gauge standards in the early days.

    2. Re:Go Cry at the Romans by rubycodez · · Score: 1

      but the lion's share are within +/- 4 inches of "standard gauge" a.k.a "international gauge"

      two-thirds the rails on the planet are standard gauge. most of the rest are damn close.

  72. Poul-Henning Kamp writes in English. by hoggoth · · Score: 1

    I tell you what, Poul-Henning Kamp... when you can write your argument concisely and clearly in Unicode symbols instead of English language using "plain ASCII text" I'll consider it.

    Hypocrite.

    --
    - For the complete works of Shakespeare: cat /dev/random (may take some time)
  73. Hello World by kybred · · Score: 1

    You've not seen Hello World until you've seen it in the original Klingon!

    1. Re:Hello World by rubycodez · · Score: 1

      That's all we need, for Larry Wall to hear of Var'aq. He'll hold up rolling out Perl 6 even further until he gets Var'aq features in there.

  74. Missing the point. by Anonymous Coward · · Score: 0

    The original post completely misses the point.

    Sure, we could replace "if" with a cute icon representing puzzlement and "for" with some kind of circular arrow - but that wouldn't change the nature of the programming language. ASCII permits us to generate a nearly infinite number of "symbols" - we call them "words" - adding more symbols into the character set would do very little for the actual nature of the language we're working in.

    What makes programming hard is not the typing of the actual characters - but the logical thinking behind those characters. Most programmers can easily type faster than they can think (and those who claim to be able to think more quickly than they type are to be avoided since they are most likely thinking shallowly and turning out poor code).

    Speeding up the entry of the "symbols" by replacing words with icons or new characters simply doesn't help.

    I also have my doubts that it would speed our code entry anyway. Modern keyboards are about the perfect size for two hands. They allow input at the maximum possible bandwidth by allowing all ten fingers to reach as many symbols as possible with reasonable motion distance. If you have more symbols in your character set - then you need more multi-key operations - and you don't gain bandwidth. Picking symbols with a mouse is also ineffectual - it's like typing on an on-screen keyboard - and we know how much that sucks compared to the real thing.

    Finally, this is far from a new idea. The language APL uses a wild profusion of non-ASCII characters...and that's the single feature that can be blamed for its failure to become more popular.

  75. Source code reading need NOT = source entered by Anonymous Coward · · Score: 0

    What if, as suggested partially in posts above, we display source code using Unicode, but allow editing it in ASCII?

    I have used APL on a keyboard manufactured specifically for that purpose (IBM, in the 1980s, on a 3277 terminal). While the language was terse, it was comprehensible. Where it failed was that I had to use a special terminal to edit the code. If I wasn't on that terminal, I was effectively locked out. That wasn't good. Worse, in my opinion, was the time I spent hunting for the right key to press to enter a particular character - sure, I learnt the frequent characters very quickly, but the less frequent demoted me to a hunt-and-peck typist.

    What I suggest is that we use Unicode to represent our code on display, but we enter it using the keyboards we have (or special ones, if we have them). Let me type for right arrow, and so forth. Allow HTML or XML type shortcuts for more obscure characters - let me type if I need a left up arrow - don't make me type a meaningless sequence like \u123 (but allow it if I happen to know it).

    The idea that the source code I see and the source code I enter have to be the same is old-fashioned. I edit source code on a specialised editor that barely uses the resources of the PC or Mac it is running on. The cost of parsing code back and forth between reading form and editing form would be minimal - consider that many source code editors provide instant source code error detection already - that requires parsing of the code on the fly.
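
    A minimal Python sketch of that round trip, under loud assumptions: the ASCII spellings and the glyphs they map to are a hypothetical table I made up, and a real editor would work on parsed tokens so that string literals and comments are left alone. The last lines show that Python already lets you name a character instead of memorising its number.

```python
import unicodedata

# Hypothetical table: what is typed (and stored) -> what the editor displays.
DISPLAY = {
    "->": "\u2192",  # RIGHTWARDS ARROW
    "<=": "\u2264",  # LESS-THAN OR EQUAL TO
    "!=": "\u2260",  # NOT EQUAL TO
}
STORAGE = {glyph: typed for typed, glyph in DISPLAY.items()}


def to_display(source):
    """Replace each ASCII spelling with its glyph for on-screen rendering."""
    for typed, glyph in DISPLAY.items():
        source = source.replace(typed, glyph)
    return source


def to_ascii(shown):
    """Inverse mapping, applied before the text is saved or compiled."""
    for glyph, typed in STORAGE.items():
        shown = shown.replace(glyph, typed)
    return shown


line = "if a <= b != c -> done"
shown = to_display(line)
print(shown)
print(to_ascii(shown) == line)

# Entering an obscure character by name rather than by code number.
print(unicodedata.lookup("NORTH WEST ARROW"), "\N{NORTH WEST ARROW}")
```

    Because the file on disk stays ASCII in this sketch, existing tools such as grep and diff would keep working on it unchanged.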

    All up, I think altering the paradigm between what I type and what I see is an appropriate solution to this problem. What do you think?

  76. What a terrible idea by SteeldrivingJon · · Score: 1

    I don't ever want to be stuck maintaining a system written by some dork who thought it was a great idea to write crucial components in Unicode Ogham runes.

    --
    September 2011: Looking for Cocoa/iOS work in Boston area Cocoa Programmer Quincy, MA
    1. Re:What a terrible idea by codematic · · Score: 1

      I agree, this is REALLY stupid... industry can't even agree on an open calendar format that EVERYONE will use... let alone some hideous thing like this. The reason ASCII has lasted and keeps lasting is because corporations haven't tried to make it all proprietary, and screw up the interoperability.

    2. Re:What a terrible idea by rubycodez · · Score: 1

      ASCII also lasts for another reason: for better or worse, the international language for diplomacy, trade and tech of this planet is English. Thousands of programming languages use English, a few dozen don't, and for most of those it's an optional choice to not use English. English is expressible in ASCII, and computers are programmed with shells, compilers and interpreters that are fed English.

      http://en.wikipedia.org/wiki/Non-English-based_programming_languages

  77. Article author didn't read spec by kongtomorrow · · Score: 2, Informative
    Mr. Pike _did_ tear down the wall. The author didn't read the spec for Pike's language. From the article:

    Unicode has the entire gamut of Greek letters, mathematical and technical symbols, brackets, brockets, sprockets, and weird and wonderful glyphs such as "Dentistry symbol light down and horizontal with wave" (0x23c7). Why do we still have to name variables OmegaZero when our computers now know how to render 0x03a9+0x2080 properly?

    The Go spec is defined in terms of Unicode, and specifically gives non-ASCII characters as example identifiers. Go source code is defined to be UTF-8.
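
    Go is not the only language that allows this. As an illustration in Python rather than Go (Python 3 also accepts non-ASCII letters in identifiers), here is a minimal sketch. The variable and its value are made up, and the loop only reports what this interpreter accepts, so it makes no claim about the subscripted form from the article.

```python
# Greek capital omega (U+03A9) used directly as a variable name.
Ω = 50.0
print(Ω * 2)

# Ask the interpreter which spellings it would accept as identifiers.
candidates = ["OmegaZero", "\u03a9", "\u03a9\u2080"]
for name in candidates:
    print(ascii(name), name.isidentifier())
```

    Whether a combination such as omega followed by subscript zero is legal depends on each language's identifier rules, which are usually phrased in terms of Unicode character categories.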

    1. Re:Article author didn't read spec by kongtomorrow · · Score: 1

      I also wonder if Poul-Henning Kamp is aware that Rob Pike is the co-inventor of UTF-8...

    2. Re:Article author didn't read spec by Anonymous Coward · · Score: 0

      That's hardly unique to it. Languages as obscure as Java and JavaScript permit Unicode identifiers.
      They're just not used much.

  78. they already did it by islon · · Score: 1

    ...a language full of symbols to represent lots of different stuff: perl. A shame it's write-only.

  79. Neat by jav1231 · · Score: 1

    It's like a geek version of "What Not To Wear." Only code.

  80. But wait... by Anonymous Coward · · Score: 0

    The reason we have so many human languages is that for most of human history, people couldn't communicate with others who lived more than a few miles away. That problem has been solved, so eventually we'll have one language that everyone speaks.

    But the reason we have so many programming languages is that each one represents a different set of tradeoffs between expressiveness, efficiency, portability, high- or low-level constructs, etc.

    We have so many programming languages for the same reason that a woodworker has so many different tools: they are each useful for different things. Sure, you might be able to use a generic chisel in place of several other more specialized tools, but it's not *optimal* for any of the tasks that those specialized tools are designed for. And thus it is with programming languages. C/C++ are good for low-level apps, Java for big bloated enterprise apps, Python or Ruby for clever apps that need to be written in a hurry and don't have to be very efficient, and so on.

    1. Re:But wait... by Twinbee · · Score: 1

      We have so many programming languages for the same reason that a woodworker has so many different tools: they are each useful for different things.

      I'm sure there are many new tools that combine the functions of older tools. Also imagine a futuristic universal glue which fused together two items more securely than any amount of superglue or nails could ever do, yet which unfastened cleanly at the flick of a switch. You can see how that would immediately deprecate hammers, nails, hooks, and the millions of glues available on the market.

      Java for big bloated enterprise apps

      You said it. Maybe you didn't quite mean this, but Java is indeed far more bloated than it should be. Wouldn't it be nice to get the speed of C/C++ but the syntactic simplicity of say Python or Ruby? The science behind achieving that is incredibly hard, but by no means impossible.

      At the most, yes, maybe we should split an entirely new paradigm like declarative programming off from the mainstream imperative styles, but even that itself would cut down 99% of all the invented programming languages.

      And yes, maybe, we should even combine logic, functional, and procedural into one melting pot. Of course, it would have to be incredibly well designed, without ambiguity, with a minimal syntax, and BASIC-like simplicity, yet as powerful and flexible as C++ or Haskell.

      I still think that's possible, but it would require a massive worldwide coordinated effort some decades down the line, when we know a lot more about the science behind it all, when CPU architectures have settled down more, and when competition in the programming language arena isn't needed as much as it is now.

      --
      Why OpalCalc is the best Windows calc
  81. visual GUI-based programming by Zobeid · · Score: 1

    If you think ASCII is a straightjacket, you're not going to break out of it merely by moving to a larger character set. You have to grow beyond character-based, text-based programming. The way you do that is with a GUI IDE.

    I could easily point to the old CanDo programming environment on Amiga, or to Smalltalk (including Squeak), or Hypercard, or various visual GUI programming tools starting with Apple's and moving forward from there. The point being. . . All of them included ASCII-based program code, but they supplanted it to varying degrees with GUI-based structure. In the more advanced examples (such as CanDo), you could create simple-but-useful programs using only the mouse, whereas typing code was required only for advanced features.

    I'm disappointed, actually, by how visual programming has stagnated. I blame the cult of Unix/Linux to some degree. The whole OS and all its tools and standards are based on ASCII text, and it's very hard for coders to get out of that mindset after growing up with it. The internet too, which was built on a foundation of Unix and HTML, is a pretty backwards place when it comes to GUI operation. Large parts of it still need to catch up with the late 1980s, to say nothing of the 21st Century.

    1. Re:visual GUI-based programming by rubycodez · · Score: 2, Funny

      visual programming has stagnated because it produces crap. Exhibit A, Microsoft Windows. Exhibit B, all Microsoft Applications not acquired by Microsoft.

      GUI code wizard 'tards, hated to have them on my coding teams....

    2. Re:visual GUI-based programming by MadMaverick9 · · Score: 2, Insightful

      I blame the cult of Unix/Linux to some degree. The whole OS and all its tools and standards are based on ASCII text

      you ever heard of the nls_utf8 kernel module? ever seen the "LANG" environment variable? set it to "en_DK.utf8" for example and you're ready to go.
      vim, svn, rm, mv, cp can handle utf8 just fine. this being on slackware 13.1.

    3. Re:visual GUI-based programming by santax · · Score: 4, Insightful

      Visual programming isn't big for the same reason people talk and not use drawings to communicate in day to day life. A decent well explained and understood language is faster, universal and more convenient. Drawings are used in situations where you can't communicate true a spoken or written language. As a replacement tool. It's very basic since with a spoken or written language you can uniformly have so much more precise interpretation of your intentions. Same goes for visual programming at this moment in time. I won't say there isn't a future for it, but as a replacement tool for the tried and tested programming environments it has a long way to go. Come up with a visual programming system for writing actually sophisticated code and you might have yourself a winner. Only party that comes in mind is Labview from NI.

    4. Re:visual GUI-based programming by santax · · Score: 1

      True = Through.

    5. Re:visual GUI-based programming by Anonymous Coward · · Score: 0

      Yes, all you need to do is remove the syntax and focus on the operators and operands. Make the operators work on values inside "desktop objects" that are type oriented (different "calculators" for numbers, dates, strings, mapping keys to values, text, hierarchical/XML/forms/trees, database access, networking, scripting, etc) and use programming by example/demonstration. I've never seen LabView, so I can't tell you how it compares to something like this: http://www.dsmforum.org/events/DSVL01/carlson.pdf Just looking at Wikipedia, LabView is a dataflow language, and TWB/TE is a procedural language. TWB/TE is just the first step that can be taken. One could create a multithreaded object-oriented stack environment (MOOSE), where you could have multiple recorders, collections of desktop objects, and more. TWB/TE is currently like one huge flowchart. This could be broken up into pieces and reused.

    6. Re:visual GUI-based programming by Anonymous Coward · · Score: 1, Insightful

      I _will_ say there isn't a future for visual programming, except perhaps in very limited domains, and even then, with a text language backup that people drop into for nontrivial applications.

      There have been hundreds of commercial and academic attempts at this, including the horrible CASE tool fad of the 90's, of which I was a victim in my first job out of school. I took a grad course in visual programming in 1989 and I knew then that it would never work. Editing text is simple and almost one-dimensional (the layout is 2D but to insert things you add new lines). This makes revising programs easy - add and delete lines, sometimes move things between lines, and the editor just pushes the surrounding code out of your way. Editing 2D diagrams (such as CASE tool diagrams in the 90's) is a nightmare, because you have to think about both the logic of your program and the layout of the diagram (moving things around so your changes will fit, then moving them for an hour more to make it pretty). The latter is a really annoying distraction.

      Even if the tool fixed that problem, you'd run into the next problem, which is that diagrams are not very expressive of complex relationships between components, especially implicit ones such as types or templates. There's a reason ASIC and FPGA designers moved _away_ from drawing logic diagrams in a circuit diagramming tool to Verilog and VHDL (text programming languages) for hardware design in the 90's - as chip and FPGA designs became more complex, a textual design language with abstraction that is hard to do graphically became necessary. Circuit board designers still use diagramming-based CAD because they still place components, holes, and layers by hand - when laying out a board you usually want some of the components to have specific locations (like the connectors), and there are few enough components that placing them by hand still works. But for hundreds of thousands or millions of components (lines of source code, or logic gates), language-based design (as opposed to graphical) provides the necessary abstraction tools and editability to get the job done.

      And also, how the hell would you diff two code-diagrams to determine what changed between the last working version and the current one?
      There are just too many working tools for text-based programming to start using something different.
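
      To show how little machinery a text diff needs, here is a minimal sketch using Python's standard difflib module. The two four-line program versions and the labels last_working and current are made up for the illustration.

```python
import difflib

# Two hypothetical versions of the same small program, one line per element.
old = ["total = 0", "for x in items:", "    total += x", "print(total)"]
new = ["total = 0", "for x in items:", "    total += x * 2", "print(total)"]

# unified_diff compares the line sequences and yields the familiar -/+ output.
diff = list(
    difflib.unified_diff(old, new, fromfile="last_working", tofile="current", lineterm="")
)
print("\n".join(diff))
```

      The comparison works because source text is a flat sequence of lines. A diagram has no such canonical ordering, so a tool would first have to decide which boxes in one version correspond to which in the other.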

      The submitter wasn't really talking about diagrammatic programming so much as expanding the symbol set from ASCII, but the subject was close enough to detonate the above rant. On the symbol-set question, I agree with everyone else that it will be an incompatible waste of time. People complain that English is a poorly designed language, with obtuse spelling rules and whatnot - but it has over a million words and everyone is learning it. A language can have a lot of flaws that make it hard to learn, but people will learn it anyway if they have a reason, and won't even see the flaws once they know it. You spend a lot more time using a language than learning it, so it's not worth optimizing ease of learning if doing so will cause a bunch of other problems with usability. I wish Microsoft would learn that about their Office UI - revising something everyone knows in a mature product is a waste of resources that would be better spent fixing bugs. But that's another rant...

    7. Re:visual GUI-based programming by Anonymous Coward · · Score: 0

      I meant imperative, not procedural.

    8. Re:visual GUI-based programming by jimicus · · Score: 1

      Don't, you're bringing back memories of my first year CS degree - modules with things like "Rich pictures" (which today I suspect was a euphemism for "I cannot express myself clearly so I'm hoping that if I draw a little picture instead I won't be asked to do so").

    9. Re:visual GUI-based programming by sco08y · · Score: 1

      Visual programming isn't big for the same reason people talk and not use drawings to communicate in day to day life. A decent well explained and understood language is faster, universal and more convenient. Drawings are used in situations where you can't communicate true a spoken or written language.

      My first thought was, "hold on, what about Powerpoint?" but then I realized that just proved your point.

    10. Re:visual GUI-based programming by DragonWriter · · Score: 1

      Visual programming isn't big for the same reason people talk and not use drawings to communicate in day to day life.

      IME, people that have complex ideas to communicate and the available tools to draw frequently draw diagrams to communicate in day-to-day life.

      And, often, people that lack handy tools use gestures to convey a visual representation that the viewer's mind will translate into a visual picture.

    11. Re:visual GUI-based programming by santax · · Score: 1

      Yes, that is true, but mostly in situations to keep things simple. It's way smarter to have a sign with a picture on it than it is to have one saying: Hey there is a cliff here that you are about to drive off. We suggest to hit your brakes NOW. Because in some cases it is far easier to picture a simple overview of something. However, once you have to tell someone how to follow the complete procedure to achieve the picture's goal, you'll quickly find out that a common language is the main tool to have.

  82. Um ,what by Estanislao+Martínez · · Score: 1

    When it was first announced (5 years ago now?), I thought the Optimus Maximus [thinkgeek.com] keyboard was going to solve this problem. With a little smarts built into the keyboard I wouldn't mind esoteric key combinations if the result was displayed directly on the keyboard. Something like this might, someday, be the solution but at $1500 dollars it's going to be a while and assuming a direct-brain interface doesn't come first.

    Eh, they sell stickers you can stick on your keys.

  83. vim, svn, etc. can handle utf8 just fine ... by MadMaverick9 · · Score: 2, Insightful
    From TFA:

    And, yes, me too: I wrote this in vi(1), which is why the article does not have all the fancy Unicode glyphs in the first place.

    Excuse me - vim can handle utf-8 just fine. utf-8 file names and utf-8 content. on a vanilla slackware 13.1.
    http://www.cl.cam.ac.uk/~mgk25/unicode.html#apps [cam.ac.uk]
    # Vim (the popular clone of the classic vi editor) supports UTF-8 with wide characters and up to two combining characters starting from version 6.0.
    # Emacs has quite good basic UTF-8 support starting from version 21.3. Emacs 23 changed the internal encoding to UTF-8.
    And svn can handle utf-8 as well - http://svnbook.red-bean.com/en/1.4/svn.advanced.l10n.html [red-bean.com].

    The repository stores all paths, filenames, and log messages in Unicode, encoded as UTF-8.

    All it requires is ... set your locale and lang. "export LANG=en_DK.utf8" in "/etc/profile.d/lang.sh" (Slackware 13.1) and add some better fonts maybe.

    I apologize for repeating myself. I've written the same thing further down already in reply to another user's post. But I just read tfa and felt the need to reply to the author of tfa.

    1. Re:vim, svn, etc. can handle utf8 just fine ... by multipartmixed · · Score: 1

      vim is not vi and it is hardly a clone.

      Superset, perhaps, but definitely not a clone.

      --

      Do daemons dream of electric sleep()?
  84. Re:Microsoft Visual Studio allows Unicode identifi by icebraining · · Score: 1

    I agree. I use them in everyday writing, but never in programming!

  85. Jesus fuck by Anonymous Coward · · Score: 0

    This kind of thinking makes all the sense of the US trying to force countries to go back to the old standards of measure. While some in the US would be happy with that most of us can see how straight up stupid this is. For those who refuse to adapt to ASCII? Fuck 'em. We don't need them. Standards have made societies thrive for thousands of years.

  86. Mathematica! by Anonymous Coward · · Score: 0

    The core programming language is still mainly ASCII constrained. However, mathematical and logical expressions can be written in TeX-style, publishable format. Makes for easy to read functions and expressions.

  87. Lisp quickly becomes a DSL by SuperKendall · · Score: 1

    I like Lisp a lot (well, elisp anyway and scheme which is really where I've had a lot of exposure).

    But when you reduce typing, the problem is that you quickly develop a DSL - Domain Specific Language. That's great for you, as long as you really understand the domain well. But almost never is someone else's abstraction of a domain the same as your own so it's hell to maintain. And if you didn't understand the domain well you can end up with a DSL that is a poor way to express what needs to get done.

    Mainstream languages stay mainstream exactly because they impose a certain level of impediment to so easily expressing yourself, that others can get confused with what you meant to say...

    --
    "There is more worth loving than we have strength to love." - Brian Jay Stanley
  88. French and English are quite different by Pezbian · · Score: 2, Informative

    I worked for a Canada-based company and one of the magazines in the break room was Forces Quebec. It was something about packaging technology and had the articles written in both English and French, as is standard in Canada.

    The bilingual nature isn't what caught my eye, though. What caught my eye was the fact that the typeface for the French articles was just plain smaller in order to fit more text in a certain space. It looked to me like the same page real estate was dedicated to each language, but the typeface for the French text was set to a smaller point size with tight kerning and spacing.

    No wonder French people talk so fast. They have to!

    In fact, when I mentioned the same thing to one of my coworkers, a Mexico native, he wasn't surprised at all. He said the same is true for Spanish as well.

    When he told me that, I remembered Cheech Marin's "Born in East L.A." where he sings about being deported to Mexico despite being a US citizen "Next thing I know I'm in a foreign land. People talkin so fast I could not understand."

    --
    In a world of the blind, the one-eyed man is king--and the two-eyed man is a heretic.
    1. Re:French and English are quite different by Noughmad · · Score: 1

      That is true for most languages. English has a large number of words, among the largest among modern languages. I know because I had to write some reports in English and they were much shorter than the Slovene translation. I also hear the same from several professors at our university.

      On the other hand, you have German. They have so long words and even longer sentences, everything written in it is double the length of the English equivalent.

      --
      PlusFive Slashdot reader for Android. Can post comments.
    2. Re:French and English are quite different by ShakaUVM · · Score: 0, Troll

      >>On the other hand, you have German. They have so long words and even longer sentences

      What are you talking about? In German, you can have words that are entire paragraphs. =)

      For example, the Reichsdeputationshauptschluss of 1803 ended the Reichsunmittelbarkeit of the HRE.

      (And no, Firefox, those words are not typos, just sentences.)

    3. Re:French and English are quite different by Pezbian · · Score: 1

      I hadn't even thought of German. Looking at the literal translations of Rammstein lyrics, you're right.

      --
      In a world of the blind, the one-eyed man is king--and the two-eyed man is a heretic.
    4. Re:French and English are quite different by LiquidMind · · Score: 1

      Why Troll? mod this funny.

      This coming from another native German speaker.

      Geschwindigkeitsbegrenzung = speed limit. Funny!

      --
      This sig contains repetition and redundancy.
    5. Re:French and English are quite different by ShakaUVM · · Score: 1

      Yeah, seriously. It's not a criticism of the German language, just something I found amusing when I studied it for a (brief) while in middle school.

      Even more amusing were the German magazines the teacher would let us read, which contained copious amounts of stuff not normally thought appropriate for middle school students in America. =)

    6. Re:French and English are quite different by gullevek · · Score: 1

      Yeah, but Rammstein lyrics, especially the older ones, had wonderful double meaning. I am not sure you can translate them very well.

      --
      "Freiheit ist immer auch die Freiheit des Andersdenkenden" - Rosa Luxemburg, 1871 - 1919
    7. Re:French and English are quite different by Pezbian · · Score: 1

      Like "hast" versus "hasst"

      --
      In a world of the blind, the one-eyed man is king--and the two-eyed man is a heretic.
  89. Have to do it by guyminuslife · · Score: 1

    You don't need a glyph for "=>" for instance. Anyone who knows what = and > mean individually can discern the meaning.

    => != >=

    --
    I don't believe in time. It's a grand conspiracy designed to sell watches.
    1. Re:Have to do it by MightyYar · · Score: 1

      You sound just like my compiler!

      --
      W..w..W - Willy Waterloo washes Warren Wiggins who is washing Waldo Woo.
    2. Re:Have to do it by leromarinvit · · Score: 1

      => != >=

      => != >= => >= != =>

      --
      Proud member of the Ferengi Socialist Party.
    3. Re:Have to do it by guyminuslife · · Score: 1

      *applause*

      I'll admit it:

      => != >= => >= != => >= => != >=

      --
      I don't believe in time. It's a grand conspiracy designed to sell watches.
    4. Re:Have to do it by badkarmadayaccount · · Score: 1

      sudo Slashdot.post.set(parent.Score)+=1

      --
      I know tobacco is bad for you, so I smoke weed with crack.
  90. very bad idea by t2t10 · · Score: 2, Insightful

    Using full Unicode for programming causes lots of problems; even string equality is a tricky proposition for Unicode, let alone precise parsing. Most people don't even know how to enter Unicode characters not found in their own language. And once you allow Unicode, people will do things like they did in APL.

    The only place Unicode should be allowed--if at all--is in comments. Everything else should be in ASCII.
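
To make the string-equality point concrete, here is a minimal Python sketch. It assumes the standard `unicodedata` module and uses "é" as an arbitrary sample character; the helper name `same_text` is made up for illustration. The same visible letter can be stored as one code point or as a base letter plus a combining accent, and a plain `==` treats the two as different unless both sides are normalized first.

```python
import unicodedata

# Two ways to store the same visible character "é".
composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE (one code point)
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT (two code points)


def same_text(a: str, b: str) -> bool:
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(composed == decomposed)            # raw comparison of code points
print(len(composed), len(decomposed))    # the two spellings differ in length
print(same_text(composed, decomposed))   # comparison after normalization
```

A language that allowed such characters in identifiers would have to pick a normalization form, or two identifiers that look identical on screen could name different things.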

    1. Re:very bad idea by MadMaverick9 · · Score: 1

      Only in comments? gee ... how about if I want to print Hello World in thai language?
      And guess what - I was gonna post some C code which does a printf of the thai language version of Hello World.
      But /. can not handle utf8 ... OMG.
      <title>Slashdot - News for nerds, stuff that matters</title>
      <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">

    2. Re:very bad idea by t2t10 · · Score: 1

      Only in comments? gee ... how about if I want to print Hello World in thai language?

      Well, of course Unicode strings should be supported. I was referring to program code, not data: keywords, identifiers, etc. Those should stay in ASCII in any programming language.

      (However, for professional development, you need to use internationalization support anyway.)

  91. Macbook Wheel sounds faster to me. by Pezbian · · Score: 1

    n/t

    --
    In a world of the blind, the one-eyed man is king--and the two-eyed man is a heretic.
  92. Mr. Kamp, you are an idiot by LynnwoodRooster · · Score: 1

    If you can write programs with just 8 characters, there is NO NEED to go beyond the base ASCII set.

    --
    Browsing at +1 - no ACs, I ignore their posts. So refreshing!
    1. Re:Mr. Kamp, you are an idiot by JesseMcDonald · · Score: 1

      Of course there is no need. So far as that goes you can write absolutely any program in just two symbols: 0 and 1. However, just as there is value in using ASCII characters and human-readable syntax in preference to opaque binary code, there may also be value in using more natural Unicode glyphs rather than forcing every idea to be expressed in the limited ASCII symbol set. Mathematicians certainly seem to think so, anyway—even in formal computer science, the use of ASCII tends to be more of an exception than a rule.

      --
      "The state is that great fiction by which everyone tries to live at the expense of everyone else." - Bastiat
  93. Coding in other languages? by mr100percent · · Score: 1

    I've always wondered why nobody made compilers to write code in non-english languages. Are we ever going to see a Hindi version of BASIC?

    1. Re:Coding in other languages? by garethw · · Score: 1

      I vaguely remember a meme - before we called 'em that - going around the net about how lots of Mexican immigrants were crossing the border and taking programming jobs, so they were coming with a new Spanish C language specification... por ( n = 0; n < 10; n++ ) { imprimaf( "Hola, el mondo!" ); }

      --
      garethw
  94. I wouldn't consider Mr. Pike an authority on by melted · · Score: 2, Funny

    I wouldn't consider Mr. Pike an authority on programming language design. At Google, he's known for designing Sawzall (described here: http://static.googleusercontent.com/externIal_content/untrusted_dlcp/research.google.com/en/us/archive/sawzall-sciprog.pdf) - a language that's so feature poor, esoteric, and ass-backwards, that Google engineers curse at length every time they have to use it. And use it they have, since it's darn near impossible, for various reasons, to do certain things without it. Try as I may, I don't see anything in Go that would make it better than half a dozen existing alternatives. It's like reinventing the bicycle again, but this time with square wheels and without the saddle. Yes, you guessed it right, that's where that pipe goes on this particular bicycle.

    1. Re:I wouldn't consider Mr. Pike an authority on by Bigjeff5 · · Score: 1

      And use it they have, since it's darn near impossible, for various reasons, to do certain things without it.

      In other words, it was designed with a specific purpose in mind, and it is the only practical tool for said purpose?

      Sounds like it's a good thing to me. Going to the dentist sucks, but that doesn't mean dentists suck at what they do, nor does it mean you should go see a bricklayer for your dental checkups.

      --
      Security is mostly a superstition... Avoiding danger is no safer in the long run than outright exposure. - Helen Keller
    2. Re:I wouldn't consider Mr. Pike an authority on by melted · · Score: 1

      In other words, it's not the best tool for this purpose, far from it. There are three or four different tools that are vastly better (and more performant, and use full-blown programming languages).

      Other tools just don't have permissions to access the files, and getting those permissions requires you to sacrifice your firstborn.

      It's like instead of going to the dentist, you'd have to go to a blacksmith, who'd pull your teeth out with rusty pair of pliers, because dentists are not allowed in your state.

  95. If ASCII was good enough for Jesus Christ by garethw · · Score: 2, Funny

    ... it should be good enough for anyone. Just sayin'...

    --
    garethw
  96. The problem with non ascii by Anonymous Coward · · Score: 0

    Is that a standard keyboard only has ascii, or at least not much else.
    and the human mind is unlikely to cope well by adding more characters.

    Using a small set of the ASCII characters is likely the only way to make a language that anyone can program in efficiently.

  97. irony by Anonymous Coward · · Score: 0

    You guys realize that Rob invented unicode, don't you?

    1. Re:irony by Anonymous Coward · · Score: 0

      Not precisely. He was a co-creator of the UTF-8 variable-length character encoding for Unicode.
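
For anyone unsure what "variable-length" means here, a small Python sketch follows. The four sample characters are arbitrary choices, not taken from the thread. UTF-8 spends one byte on ASCII characters and up to four bytes on code points further up the Unicode range.

```python
# Sample characters chosen to land in each UTF-8 length class.
samples = {
    "A": "LATIN CAPITAL LETTER A (U+0041)",
    "\u00e9": "LATIN SMALL LETTER E WITH ACUTE (U+00E9)",
    "\u20ac": "EURO SIGN (U+20AC)",
    "\U0001F600": "GRINNING FACE (U+1F600)",
}

# Map each character to the number of bytes its UTF-8 encoding takes.
lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, label in samples.items():
    print(f"{label}: {lengths[ch]} byte(s)")
```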

  98. Another Issue... by MacGyver2210 · · Score: 1

    This is how I became so completely wapuro-baka...

    For reference: "Wapuro Baka" is a phrase that means 'Word Processor Stupid'. This describes someone who can write in Japanese on a computer by typing the Romaji (English letter) sound-only syllables, but cannot write the harder meaning words by hand because they do not know the symbols in detail.

    --
    If the only way you can accept an assertion is by faith, then you are conceding that it can't be taken on its own merits
  99. Support is needed ... by PPH · · Score: 1

    ... for either Han or Sanskrit characters in programming languages.

    The way the s/w development market is going, ASCII support for Latin character sets is becoming pointless.

    --
    Have gnu, will travel.
  100. pros? by Charliemopps · · Score: 2, Insightful

    Ok, so everyone agrees this is a stupid idea... but are there ANY pros? I just don't understand the premiss at all...

    1. Re:pros? by aaronrp · · Score: 1

      The advantage of course is lack of ambiguity. Is this particular asterisk used to mark a comment, multiplication, repetition of strings, or exponentiation?
      I tend to agree that too many symbols would be confusing, but if done judiciously, I think it makes a certain amount of sense. For example, I find irritating perl's use of the brace for three separate purposes ( hash keys, anonymous hash construction, and code blocks). Separate symbols would be nice.
      Of course, you could do the same thing with multiple-character symbols (maybe [[ and ]] could delimit hash keys and anonymous hashes). Either way, it's two keystrokes.

    2. Re:pros? by Charliemopps · · Score: 1

      But ASCII has plenty of extra special characters that are never used. I think that all that would be needed is a new keyboard to make use of them or a UI that made it easy to select them.

  101. C++ Support for Unicode by Anonymous Coward · · Score: 0

    C++ already supports Unicode for identifiers and in comments. What more do you expect? If the operators and keywords were supposed to support Unicode, wouldn't the programming language be encumbered with many different translations for each code set (or language)? I think identifier support is plenty. Let's not make it harder to develop and maintain correct apps, it's already hard enough.

  102. Expanding the common set by ChrisMaple · · Score: 1

    I can see value in characters that improve readability, or that appear often enough that their absence is a nuisance.

    Left-arrow for assignment, so that the equal sign can be reserved for comparisons.

    Single characters for .NE. .LE. .GE.

    Floor and ceiling symbols

    A "degree" symbol

    An upward arrow for exponentiation, so that caret always means xor.

    Something new to make the declaration and use of pointers clearer, the way C does it is just too confusing.

    But these are just my pet peeves; I'd be surprised to see many people agreeing with me.

    --
    Contribute to civilization: ari.aynrand.org/donate
    1. Re:Expanding the common set by Tacvek · · Score: 1

      Quite a bit of the confusion could be fixed for existing languages simply by improving the IDE.

      Imagine if in an IDE, you are writing C code. When you press the equals key, you get a left arrow symbol. The actual source code still has an equal, but it is displayed as a left arrow. Only when you type the second equals does it show up as an equal sign.

      Similarly ! might be displayed as the logical not symbol, while != gets shown as the classic not equal symbol. The caret would show up as a plus in a circle (a common symbol for XOR), etc.

      The -> operator would be displayed as a one character right arrow, of course.

      But this would all just be display tricks. If after typing a double equals, you hit backspace, it demotes back into a left arrow. If you move the cursor into the middle of a compound character it breaks down into its component parts, until the cursor is moved out. Etc.

      That would work well, without having anything be hard to type, since the actual source is ASCII.

      Make the * character in C show up as completely different characters depending on the function it is performing (i.e. for multiplication, replace it with a slightly over-sized multiplication dot; for pointer declaration, some other symbol; same for de-referencing).
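
The display-only idea above can be sketched in a few lines of Python. Everything here is hypothetical: the `DISPLAY` table, the `render` function and the sample line are invented for illustration and do not describe any existing IDE. The point is just that the stored source stays ASCII while the rendered view uses glyphs, with longer tokens matched first so that `==` does not turn into two arrows.

```python
import re

# Hypothetical display table: ASCII token in the file -> glyph shown on screen.
DISPLAY = {
    "==": "=",        # comparison is shown as a plain equals sign
    "!=": "\u2260",   # not-equal sign
    "<=": "\u2264",   # less-than-or-equal sign
    ">=": "\u2265",   # greater-than-or-equal sign
    "->": "\u2192",   # right arrow
    "=": "\u2190",    # assignment is shown as a left arrow
    "!": "\u00ac",    # logical not sign
    "^": "\u2295",    # circled plus, a common XOR symbol
}

# Longer tokens go first in the alternation so "==" wins over "=".
_TOKEN = re.compile(
    "|".join(re.escape(tok) for tok in sorted(DISPLAY, key=len, reverse=True))
)


def render(source: str) -> str:
    """Return the display form of an ASCII source line; the input is not modified."""
    return _TOKEN.sub(lambda m: DISPLAY[m.group(0)], source)


line = "if (a == b) p->x = !c ^ d;"
shown = render(line)
print(shown)
```

A plain regex like this knows nothing about string literals, comments or compound operators such as `+=`, so a real editor would more likely work from the language's own tokenizer. That would also be needed for the context-dependent `*` case.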

      --
      Stylish sheet to fix many problems in Slashdot's D3: https://gist.github.com/801524
  103. It is by definition not a natural language by Sycraft-fu · · Score: 1

    A natural language isn't one spoken by humans, it is one that came about naturally. It grew up from usage by humans, the rules formed from long convention, and one that is living, changing. If you invent a language for a special purpose, be it computer programming or clear communication, it isn't a natural language. Also you'll notice that nobody is going around speaking Lojban. It hasn't taken off, at all, in society.

    For that matter even if it did, it probably wouldn't work as a programming language. Programming must be unambiguous in a way that is just hard for most humans to understand. Everything must be precise, everything must be spelled out (and done so correctly). Trying to construct a spoken language like that would be a waste because people would never want to use it. Humans can get multiple levels of meaning, analogies, metaphors, and so on that computers can't handle. Useful in human communication though.

  104. Didn't we try this once already? by WDubois · · Score: 1

    It was called APL.

  105. Programmers talk to Machines, not people by bug1 · · Score: 1

    This guy is typical of the modern generation of programmers who think software should be made for _people_ to read !

    Pity the dude trying to write a compiler to interpret his (or worse, my) pictures...

    Software is primarily made for computers to read, it is the job of a programmer to translate real world problems into the machine world.

    Damn idiot money men think that using high level languages so idiots can write software is going to lead somewhere other than lots of idiotic software...

    GET OFF MY LAWN !!!

  106. Obviously Poul-Henning Kamp never used an ASR-33 by AndroidCat · · Score: 1

    OR HE MIGHT HAVE NOTICED THE LACK OF A CHUNK OF THE ASCII TABLE. (No, it's not like yelling, it's like an ASR-33 you insensitive clod! *sigh*)

    --
    One line blog. I hear that they're called Twitters now.
  107. All for Unicode source code, not syntax by mgiuca · · Score: 1

    I'm all for languages which allow Unicode characters in their source (Unicode strings, Unicode comments). That simply makes it easier for foreign developers and foreign language strings. Luckily, most modern languages (including Go) do allow this.

    But Unicode syntax is a nightmare to type. It should be perfectly possible for me to type an entire program using only the symbols I see on my standard US keyboard.

  108. The trouble with huge character sets. by Animats · · Score: 2, Interesting

    This has come up in the context of domain names, where a long, painful set of rules has been devised to try to prevent having two domain names which look similar but are different to DNS. If exact equality of text matters, it's helpful to have a limited character set for identifiers.
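
A short Python sketch of the look-alike problem, assuming only the standard `unicodedata` module. The name "example" and the helper `describe` are made up for illustration. A Latin "a" and a Cyrillic "а" render almost identically in most fonts but are different code points, so two names can look the same and still compare unequal.

```python
import unicodedata

latin_a = "a"            # U+0061
cyrillic_a = "\u0430"    # U+0430, visually near-identical in most fonts


def describe(text: str) -> list:
    """List each character's code point and official Unicode name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


name_1 = "example"
name_2 = "ex" + cyrillic_a + "mple"   # same shape on screen, different code point

print(name_1 == name_2)
print(describe(latin_a), describe(cyrillic_a))

# Compatibility normalization does not merge the two letters either.
print(unicodedata.normalize("NFKC", name_2) == name_1)
```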

    There's currently a debate underway on Wikipedia over whether user names with unusual characters should be allowed. This isn't a language question; the issue is willful obfuscation by users who choose names with hard-to-type characters.

    As for having more operators, it's probably not worth it. It's been tried; both MIT and Stanford had, at one time, custom character sets, with most of the standard mathematical operators on the keys. This never caught on. In fact, operator overloading is usually a lose. Python ran into this. "+" was overloaded for concatenation. Then somebody decided that "*" should be overloaded, so that "a" + "a" was equivalent to 2*"a". The result is thus "aa". This leads to results like 2*"10" being "1010". The big mistake was defining a mixed-mode overload.
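
The Python behaviour described above is easy to reproduce. The variable names in this short sketch are arbitrary.

```python
# "+" concatenates strings, and int * str repeats the string.
assert "a" + "a" == 2 * "a" == "aa"

repeated = 2 * "10"       # mixed-mode overload: string repetition, not arithmetic
numeric = 2 * int("10")   # explicit conversion gives the arithmetic result

print(repr(repeated), numeric)
```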

    In C++, mixed-mode overloads are fully supported by the template system and a nightmare when reading code.

    In Mathematica, the standard representation for math uses long names for functions, completely avoiding the macho terseness the math community has historically embraced.

  109. They built the bomb with less by RightwingNutjob · · Score: 1

    Key calculations for the design of the first implosion-type atomic bomb, which involved solving nonlinear three-dimensional differential equations to make sure the little booms that caused the big boom reached the core at the same time were solved by punching octal code into paper tape and running it on a mechanical computer.

  110. back in the old days by Anonymous Coward · · Score: 0

    we only used 1's and 0's and we got along fine.

  111. IDEs with less ascii by boxxxie · · Score: 1

    when i use inkscape (SVG program) and i want to pick a color for my square, so my square looks nice and pretty, i have 4 color choosing tools to use, 3 that let me select the color based on some color rules and sliding bars, and a color picker that lets me select a color that i've used on some other part of my drawing. i could also manually enter a symbol/value representing the color on 2 or 3 of the color tools. the document saves the color information in XML/SVG format, and i can find it with the raw-xml editor in inkscape and change the color value from there. i can also write my own SVG/XML in the xml tool in inkscape.

    SVG is a language for making graphics, almost a DSL (it's in XML, so it can't be a pure DSL). the SVG language is pretty complex and is pretty hard to write by hand. it's a bit hard to read and edit without a good xml editor.

    anyway, an individual color is a value, it is represented by multiple values, for anything other than simple colors you use a tool to pick it. this is idiomatic, and has been for many years. sometimes this way of coding works very well. the best example i can think of is sikuli. in sikuli you program by taking screenshots and your program is like a state-machine where the input is detecting images that the programmer has extracted from their screenshots. the IDE shows thumbnails of the screenshots and text. it would be less effective to replace the thumbs with the pathnames of the image files they represent (which is how they are in the code files).

    i think that most people's arguments are about separation of the programming language and the IDE, but i think there is something good to a language that has interfacing with an IDE, or general interfacing, in mind. I think there are benefits for having the language be a bit less human readable in order for it to better interface with things, like an IDE. examples: javadoc, java annotations. these are hacks, make code less readable, but are there just for other programs to use.

    so, do we blame IDE makers for not giving us IDEs that help us to understand our code in the best way possible, or do we blame the language makers who don't include rich meta-code constructs in their languages?

    why the hell do i have to write in HTML to get separated paragraphs for posting on this form?

  112. Eastwest is ascii free by palomer · · Score: 1

    I once wrote a programming language which was syntax free. For example, here is a program which calculates square roots using Newton's method, written in Japanese:

    http://www.youtube.com/watch?v=vwgvVpCRecE

    The types and function names are in Japanese. The variables are in English, but this needn't be the case.
    For those of you who want to see more, this video shows me writing a calculator application:

    http://www.youtube.com/watch?v=SSZBc2ohR2o

    For more information, please see the following page:

    https://sites.google.com/site/rathereasy/eastwest

  113. My code doesn't work. by slimjim8094 · · Score: 1

    #include <stdio.h>

    int main(char** argc, int argv) {
            int x=–5;
            printf("%d\n", x);
    }

    The errors are fairly obvious if you compile, but it's not easy to see. Now you could tell me that C isn't designed to be written in Unicode, and you'd be right, but at least it's pretty clear which characters are wrong. A language designed for unicode would be even worse, since the characters wouldn't be illegal outright, and it might try to convert emdashes to - for a subtraction, etc.
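
One way to hunt for this kind of look-alike character is to scan the source for anything outside ASCII. The Python sketch below assumes the stray character is an en dash (U+2013). The function name `find_non_ascii` and the one-line snippet are made up for illustration.

```python
import unicodedata


def find_non_ascii(source: str):
    """Yield (line, column, character, Unicode name) for each non-ASCII character."""
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 0x7F:
                yield line_no, col, ch, unicodedata.name(ch, "UNKNOWN")


# "\u2013" is an en dash standing in for the ASCII hyphen-minus.
snippet = "int x=\u20135;\n"
hits = list(find_non_ascii(snippet))

for line_no, col, ch, name in hits:
    print(f"line {line_no}, column {col}: {name} (U+{ord(ch):04X})")
```

This only helps while non-ASCII characters are illegal in code. In a language where they are legal syntax, a check like this has nothing to flag.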

    Bad idea. Code is terse by design. Ever noticed how much harder it is to say precisely what you mean in, say, Applescript? Adding a character set of 100k is a terrible idea.

    --
    I have developed a truly marvelous proof of this comment, which this signature is too narrow to contain.
  114. Oh, cruel irony of slashcode by Scrameustache · · Score: 1

    You don't need a glyph for "=>" for instance. Anyone who knows what = and > mean individually can discern the meaning.

    If slashdot didn't eat it, you could be seeing &rArr; displaying that glyph here... ""

    --

    You can't take the sky from me...

  115. OMG I've seen less vitriol discussing, well anyth'n by Anonymous Coward · · Score: 0

    Do you people even read before posting? Someone dares to posit that the emperor has no clothes and everyone begins throwing stones. The core of his statement is that: "Programming languages can be more information dense by not being confined to ASCII" And what does the community at large do? Begins stuffing an effigy and mounting it on his front lawn. I've got one for you; how about a heliocentric solar system? Or a system of matter based on discrete atoms? Or how about a biological system based on cells with inheritance? Burn the witch! Burn it!

    Still with me? How about a little use of the forebrain and a little less of the midbrain and Molotov cocktails?

    Could a programming language be less visually and conceptually obtuse if the information density per character is increased? The answer is so obviously yes that any naysayer must be racist. Yes, I said racist. Look at your keyboard, now look at a globe. Know what? Over 90 percent of that globe doesn't use that character set. Doesn't matter who you are. Look I even used percent instead of 0/0 because I don't know how to enter that character.

    *sigh* Yah know, sometime it will be time to kill the golden calf and move on.

  116. Eloquent nonsense by SpaghettiPattern · · Score: 1

    Poul-Henning Kamp seems to know his languages. Good for him.

    But I take it he's never worked in a financial institution where application programmers have to produce working code. Cursed are the idiots that write programs using glyphs outside the intersection of the ASCII and EBCDIC sets. IMHO it is a sin to put "fancy" characters even in comments.

    Ever tried outsourcing code support to Hyderabad? How do you fancy your chances of being efficient or effective without limiting yourself to English in ASCII? Would you be pleased to see method and variable names in Urdu?

    The god given language for programming is English, encoded in ASCII or in EBCDIC. Programming languages needn't support any other spoken language. To further emphasise this point, I speak as a non-native English speaker, not living in an English speaking country.

    --

    I hadn't the slightest objection to his spending his time planning massacres for the bourgeoisie... (P.G. Wodehouse)
  117. BASIC translations by Compaqt · · Score: 1

    I don't think it should be that difficult to translate BASIC, just as a teaching tool for non-English societies.

    Say, take something like http://www.freebasic.net/ , and change the string constants for FOR, WHILE, PRINT, etc. to something in your own language.

    MS used to have localized Office Basic.
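
As a rough illustration of the keyword swap, here is a Python sketch that maps localized keywords back to English before handing the line to an ordinary BASIC. The Spanish keyword table is invented for this example and does not come from FreeBASIC or any real dialect. Only keywords are translated, one for one, and string literals are left alone.

```python
import re

# Hypothetical Spanish keyword table, invented for illustration only.
KEYWORDS = {
    "PARA": "FOR",
    "HASTA": "TO",
    "SIGUIENTE": "NEXT",
    "MIENTRAS": "WHILE",
    "IMPRIMIR": "PRINT",
}

# Match either a double-quoted string literal or a whole identifier.
_PIECE = re.compile(r'"[^"]*"|[A-Za-z_][A-Za-z0-9_]*')


def to_english(line: str) -> str:
    """Replace localized keywords with English ones, skipping string literals."""
    def swap(match):
        word = match.group(0)
        if word.startswith('"'):
            return word                          # leave string literals untouched
        return KEYWORDS.get(word.upper(), word)  # unknown words pass through
    return _PIECE.sub(swap, line)


src = 'PARA I = 1 HASTA 3: IMPRIMIR "PARA TI": SIGUIENTE I'
print(to_english(src))
```

A table like this handles vocabulary only. It does nothing about word order or grammar in the target language.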

    --
    I'm not a lawyer, but I play one on the Internet. Blog
    1. Re:BASIC translations by paedobear · · Score: 1

      Yes, and they dropped it, FAST, because everyone in markets that got localised VBA hated it

    2. Re:BASIC translations by arth1 · · Score: 2, Insightful

      One problem is that word-for-word translations don't work. Other languages have both cases and genders applied to words, and often a different sentence structure too. Should "LET A=10" become "A=10 LASSEN"? What about Russian where the gender is significant? Or Japanese, where the status between speaker and listener determines the word? And what about right-to-left languages? Or top-to-bottom ones? But the biggest problems are, of course, compatibility and maintainability. You can't hire consultants who don't speak the language. And what if you branch out from Iceland to Sweden? Will you hire Swedes who speak Icelandic, or port all your apps to Swedish and maintain two different versions and prohibit unported e-mail attachments? Ask yourself why Microsoft doesn't have localized Office Basic anymore.

    3. Re:BASIC translations by Compaqt · · Score: 1

      That's why English is probably best for the base keywords (and even the comments and variables):
      -No "weird" characters
      -Can be expressed in ASCII7
      -Flexible grammar

      But translating KTurtle to Swahili for general audiences is probably OK (as opposed to serious, advanced classes for programmers).

      --
      I'm not a lawyer, but I play one on the Internet. Blog
  118. going the wrong way by bugs2squash · · Score: 1

    How about coming up with a new alphabet with just 10 characters, or fewer. That would make my keyboard smaller. Just so long as the letters U, K, C and F are included, I'll be able to express myself with my usual panache and I might even be able to find the keys on my phone.

    --
    Nullius in verba
  119. Unicode is SuperASCII by mrgren · · Score: 1

    ASCII will be with us forever, because it's enshrined in the bottom 7 bits of Unicode! U+0000 through U+007F are ASCII. The most common/useful encoding of Unicode is UTF-8, which is backwards compatible with ASCII. Your ASCII data is Unicode, in UTF-8 guise.
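
    A minimal Python sketch of that claim (the sample string is arbitrary): pure-ASCII text produces identical bytes under both encodings, and only code points above U+007F need more than one byte in UTF-8.

```python
# Every ASCII string encodes to the same bytes under ASCII and UTF-8.
text = "Hello, World!"
ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
same_bytes = ascii_bytes == utf8_bytes

# The first 128 code points (U+0000..U+007F) are the ASCII range,
# and each one is a single byte in UTF-8.
all_single_byte = all(len(chr(cp).encode("utf-8")) == 1 for cp in range(0x80))

# Anything beyond U+007F takes more than one byte in UTF-8.
omega_len = len("\u03a9".encode("utf-8"))  # GREEK CAPITAL LETTER OMEGA

print(same_bytes, all_single_byte, omega_len)
```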

  120. Don't take everything so seriously by glassware · · Score: 3, Insightful

    I'm truly saddened to see so many people took this article summary so literally. If you read TFA, it's actually a very bright, intelligent, humorous example of programming insight. I found it a very delightful read and I wholeheartedly felt that the article presented its thoughts lightheartedly and without expectation of seriousness. To hear all the commenters here, it's as if the article ran puppies over with a steamroller.

    Please guys - I'm all for silly commentary. But read the article if you're going to pretend to write something clever. It's thoroughly tongue-in-cheek.

  121. Algol 68 by andyh-rayleigh · · Score: 1

    Or even Algol 68

  122. I hope this is a troll by Anonymous Coward · · Score: 0

    If not, no way.
    I don't like math expressions... all those underscore numbers meaning something weird and a new symbol for a new operation.
    In a sense, math vs. programming language syntax (yes, I know there are some exceptions) is the same as, for example, written Finnish vs. written Chinese.
    We have a new mathematical operation, what to do? Invent a new symbol? No Fucking Way! NO!

  123. Just use assembly by Anonymous Coward · · Score: 1, Funny

    You can assemble it to binary and then disassemble it to any mnemonic set you like.

  124. Tree-style editors? by yacwroy · · Score: 1

    There's no reason to be locked to an array of arrays of characters as the only code format. Program code is inherently tree-structured.

    Recent Blizzard editors (WC3, SC2) have a tree-based system for their code. I'm not sure if there are many other examples, but IMO this is the way forward.
    Pluses:
    - No compilation errors.
    - Faster compilation since parsing is already done.
    - No typos.
    - Perfect code-completion / syntax-highlighting.
    - No arguments about style guidelines because there's nothing but content. No whitespace etc. Style can be set from your end in your editor.
    - Changing all copies of an identifier name is instant and flawless.
    - Smaller files.
    Downsides:
    - Can't use regular text editors.
    - Difficult to work backwards (can't write a variable before it's declared).
    - Catch 22 when there are no editors that deal with the format. Also, format wars.
    - Included files need to be read by editor.

    There'll be more pluses and minuses I'm sure.

    There's nothing stopping anyone from making a C++ (or C++-esque) frontend like this.
    There's nothing stopping these editors from being as quick to use as text editors. In fact, they should be way faster.
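
    As a rough illustration of two of the pluses above (renaming and style-free storage), here is a sketch using Python's standard `ast` module. It is only an analogy: it still starts from text, whereas a true tree editor would never store text at all. The sample source and the `Rename` helper are made up for this example.

```python
import ast

# Deliberately inconsistent spacing; the tree does not keep it.
source = "total = price*qty\nlabel = 'price'\n"

class Rename(ast.NodeTransformer):
    """Rename one identifier everywhere it appears as a name node."""

    def __init__(self, old, new):
        self.old, self.new = old, new

    def visit_Name(self, node):
        if node.id == self.old:
            node.id = self.new
        return node

tree = ast.parse(source)
tree = Rename("price", "unit_cost").visit(tree)

# ast.unparse (Python 3.9+) regenerates text in one canonical style.
renamed = ast.unparse(tree)
print(renamed)
```

    The string literal 'price' is left alone because the rename works on name nodes rather than on matching characters, which is the kind of precision a text-based search and replace cannot guarantee.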

    --
    You agree with me.
  125. What?! by CobaltBlueDW · · Score: 1

    Unicode programming?! I get pissed when programming languages include shift-accessed characters in their standard syntax. Like PHP using '->' instead of '.' . Unicode programming sounds about as irrational as the natural language programming ideals of COBOL.

    slashdot.binspam.add(this);

    Quick, concise, logical, objective... Too bad parentheses and curly brackets are shifted...

  126. Missing the point. by Anonymous Coward · · Score: 0

    The entire point of a programming language is to write something in a language sufficiently native to both the programmer and the compiler. Programming languages use the glyph set many humans are familiar with in order to provide a rigid framework in which a compiler can write machine language to lead the machine into performing the desired task(s).

    If a larger character set is part of your native language, great. Use some version of your favorite programming language that gets along well with what you're trying to say. Really, this only applies to variable names as commands and other reserved words are quite finite and well defined.

    Expanding the number of glyphs used to represent a command is a solution to a problem that doesn't exist. Do I really need to learn an otherwise meaningless language to express the command "cout" as a single character instead of a combination of 4 glyphs? Once you reach a certain number of glyphs (a couple hundred if I remember correctly, depending on your end instruction set / hardware arch) the advantage of using a compiler becomes more of an optimization question than a translation question. If some future generation is going to be forced to learn a glyph set with a higher count than the machine itself, then why the hell should we have anything but machine language with code optimizers instead of compilers?

  127. Author of article is idiot - read the damn spec by gwappo · · Score: 1

    from the article:

    Unicode has the entire gamut of Greek letters, mathematical and technical symbols, brackets, brockets, sprockets, and weird and wonderful glyphs such as "Dentistry symbol light down and horizontal with wave" (0x23c7). Why do we still have to name variables OmegaZero when our computers now know how to render 0x03a9+0x2080 properly?

    uh yeah, Go allows you to use the full Unicode range just fine; in identifiers and everything - people should read the spec before making sensationalist comments that waste everyone's time.
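
    For reference, the code points quoted from the article can be looked up with Python's standard `unicodedata` module. This sketch only reports their official names; it makes no claim about which of them any particular language accepts inside an identifier, since that depends on each language's own rules.

```python
import unicodedata

# Code points mentioned in the quoted passage.
code_points = (0x23C7, 0x03A9, 0x2080)
names = {cp: unicodedata.name(chr(cp)) for cp in code_points}

for cp, name in names.items():
    print(f"U+{cp:04X} {name}")
```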

  128. it's ironic by toby · · Score: 1

    That the linked ASCII chart is SVG... which can render Unicode and is encoded in an explicitly Unicode medium...

    No - I think we're already in the future, sorry PHK.

    --
    you had me at #!
  129. Where's Marshall McLuhan by toby · · Score: 1

    When you need him.

    And somebody else said something about 26 soldiers of lead conquering the world but the interwebs can't seem to decide who, or if. That's progress!

    --
    you had me at #!
  130. Space Cadet Keyboard by Anonymous Coward · · Score: 0

    Oh YES YES Mr. Poul-Henning Kamp can you buy me a space cadet keyboard where I have to press Alt-Shift-Ctrl-Meta-Cokebottle to get a Q?

  131. yeah right.... by SuperDre · · Score: 0

    He should get his head examined... If he says what he says he certainly isn't a programmer.. If you need more characters than what's possible with ASCII for creating your code you should really get back to basic or get out of the business altogether.. there is no need for even more characters to create your code..

  132. Not really by Sycraft-fu · · Score: 1

    For one, you have to be good at handwriting. Many people, like me, aren't. Computers do not have an easy time with character recognition. They are much better these days, but they still lag a good bit behind humans. So whatever trouble a person has with recognizing your writing, a computer will have more. Next, this is even more problematic when you have a language with lots of glyphs because so many are very similar. It can be real difficult for them to tell the difference, and many of the tricks they use that have made them better won't work for those kinds of languages. Then there's the fact that you have to learn a new writing skill, looking at a screen that you aren't writing on. Difficult for people to do, to not look at the hand you are writing with. Decreases your penmanship further. Finally there's the fact that it is much slower. Even a fast scribe has nothing on a normal typist.

    The current transliteration solution we have works well, and that is why it continues to be used. Has the other advantage that the typing skills you learn apply just as well to Western languages, you don't have to learn a new set of skills.

    1. Re:Not really by AlecC · · Score: 1

      Agreed. After 40 years of using computers, I have almost lost the ability to handwrite, I simply very rarely have the need to do more than jot down a few words or, more often, number/character strings (train times, phone numbers etc). My longhand is barely legible to me, and degrades very quickly. After about 20 words, it is illegible. I had to hand-write an envelope yesterday (printer cartridge ran out between letter and second copy of envelope after I inserted envelope in printer wrong). It was hard, took a while, and was ugly though (I hope) functional. (The postcode was certainly legible). I would need to embark on a serious stint of retraining to get back handwriting comprehensible to humans, let alone computers.

      --
      Consciousness is an illusion caused by an excess of self consciousness.
  133. Re:The thing with ASCII [COBOL 2.0?] by hcdejong · · Score: 1

    What about novice or occasional programmers? I've done a few things in Python over the years, and I've found that even with this fairly simple language, my knowledge of the syntax etc. leaks away with disuse. I recently had to write a program after about 1 year of not using Python, and I spent half the time relearning the language. Terse, non-English languages like C have a higher barrier to entry than Python because of this.

  134. english has 42 sounds not 26 by Anonymous Coward · · Score: 0

    and 1400 different ways of spelling them. If there were 42 characters, it would vastly simplify english spelling.

  135. Bob Bemer birthed backslash by Anonymous Coward · · Score: 1, Informative

    Wikipedia claims that ASCII grew the backslash [\] specifically to support ALGOL's /\ and \/ Boolean operators. No source is provided for the claim. ftfa

    Here's one of the two sources that Wikipedia cites, straight from the inventor of the backslash: HOW ASCII GOT ITS BACKSLASH citing his book [ R.W.Bemer, "A view of the history of the ISO character code", Honeywell Computer J. 6, No. 4, 274-286, 1972 ]

    "I had called a joint meeting of IBM, SHARE, and GUIDE, to regularize the IBM 6-bit set to become the standard BCD Interchange Code [76]. Frequency studies of symbol occurrence had been prepared, particularly from ALGOL programs. The meeting of 1961 July 6 produced general agreement on a basic 60-64-character set, which included the two square brackets and the reverse slant, which was chosen in conjunction with "/" to yield 2-character representations for the AND and OR of early ALGOL. This is reflected in the set I proposed to ANSI X3.2 on 1961 September 18."

              (Note: I had put the backslash in position 5/15. It enabled the ALGOL "and" to be "/\" and the "or" to be "\/".)

    Apparently he also invented ten other ASCII codepoints (called himself the father of ASCII), timesharing, escape sequences, the Y2K bug, word processors... and COBOL.

  136. oh boy by XCondE · · Score: 1

    The Norges are sure fond of their umlauts.

  137. Funny and sad... by CondeZer0 · · Score: 1

    Funny given Rob Pike's involvement with the creation of UTF-8.

    Sad that, as has become common, everyone and their dog want their pet feature in Go, totally missing the point of the language, which is: a small and very carefully selected set of features that work well together and don't interfere with each other in unexpected ways.

    Sad also that ken's involvement in the creation of both UTF-8 and Go goes unmentioned.

    In any case, are there people out there who have forgotten what a huge pain it was to program in APL?

    There are reasons why modern successors of APL, like K (which by the way is a super cool language) stick to ASCII: you can actually write code without going insane!

    --
    "When in doubt, use brute force." Ken Thompson
  138. There are 10 kinds of people in the world... by xenobyte · · Score: 1

    Seriously, it's all about 1's and 0's. Does it really matter what language and syntax abstraction you use to enter them?

    If it ain't broken, why fix it? - And is ASCII broken? - No, it works just fine for expressing the programming languages we've used so far, and as far as I know they work just fine in solving all our algorithmic needs.

    --
    "For every complex problem, there is a solution that is simple, neat, and wrong." -- H.L. Mencken (1880-1956) --
  139. Das ist ... by Anonymous Coward · · Score: 0

    Except for the Germans. I don't think their language uses spaces.

    ...richtig, wir Deutschen benutzen Tabs anstatt Spaces. ;)

    Translates to: "That's right, Germans use tabs instead of spaces."

  140. ASCII Wall what ASCII wall ? by Anonymous Coward · · Score: 0

    ASCII wall? What ASCII wall? ASCII is a simple way to encode messages. What are the alternatives? 32-bit Unicode? No, surely not; think about all the characters that look the same and are not. That will result in confusion exploitable by spammers and skimmers. There is a reason why programming languages are restricted to ASCII: it represents a workable set. Going beyond it will mean nobody (except computers) will ever read the programs. Then you can go directly to binary. In short, ASCII solves more problems than it creates. It is a universal standard these days; do not make the world more complicated than it already is.
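
    The lookalike problem is easy to demonstrate. A small Python sketch, using three capital letters that render almost identically in most fonts but are distinct code points (the choice of these three is just an example):

```python
import unicodedata

# Latin A, Greek Alpha, Cyrillic A: near-identical glyphs, different characters.
lookalikes = ["A", "\u0391", "\u0410"]

names = [unicodedata.name(ch) for ch in lookalikes]
distinct = len(set(lookalikes))               # how many different characters
ascii_only = [ch.isascii() for ch in lookalikes]

for ch, name in zip(lookalikes, names):
    print(f"U+{ord(ch):04X} {name}")
```

    Comparing strings that contain these characters fails even though they look equal on screen, which is exactly the confusion described above.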

  141. Unicode in C, C++ and Perl by rl117 · · Score: 2, Informative

    One thing many people aren't aware of is that for several years now (since GCC3), GCC and G++ accept UTF-8 as their default input encoding, and internally store narrow and wide strings as UTF-8 and UTF-32, respectively. It's recoded to the output stream locale when you do any output. This means you can write your source code in Unicode (in strings and comments at least) and it all works perfectly. It has full support in the C and C++ standard libraries. I've been using it for years; it works perfectly. It would be nice to get support for UTF-8 symbols in the linker, so we can have UTF-8 variable names as well. The same applies to Perl, though perl6 even gives you the ability to have Unicode operators, and possibly variable names.

    I do routinely use UTF-8 symbols in R (example: "deltaCt" can be replaced with the actual Delta symbol [Slashdot ate the Unicode--seriously poor!]). It makes the code more readable, and entry isn't the massive issue people make it out to be. AltGr/compose keys handle the common symbols, and you can look up the few odd ones that aren't in the compose tables.
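
    The same idea sketched in Python 3 rather than R (an assumption made purely for illustration, since Python 3 also accepts non-ASCII letters in identifiers). The numbers are made up.

```python
# Hypothetical threshold-cycle values, for illustration only.
ct_target = 24.5
ct_reference = 18.0

# Greek capital delta (U+0394) is a letter, so it is legal in an identifier.
ΔCt = ct_target - ct_reference

print(ΔCt)
```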

    Having the ability to use Unicode does not in any way detract from the ability to use ASCII. Since ASCII is a strict Unicode subset, the ability to use Unicode imposes zero overhead on those who wish to stick with ASCII, so the extent of the hate seen for wanting a bit of progress is a bit shocking. People pointed out how unreadable code could be made, but the reality is that when used sensibly and judiciously, it can make code more concise and readable.

    http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=522776 for information about some of the issues.
    Having native Unicode support end-to-end by default is still a goal we want to achieve; the ASCII C locale is the last holdout. Getting a UTF-8 C locale is the last remaining step, though it'll take a few years to get there.

    Regarding editing Unicode sources, both Emacs and vim have pretty decent Unicode support, and Linux distributions have had unicode support for a decade now, and really good support for at least six years. Broken tools are no longer an excuse for not using Unicode.

    Regards,
    Roger

  142. Bullshit! by bickerdyke · · Score: 1

    Sorry, this idea is utter bullshit. I'd like to see that guy's face when he tries to include a library by some European programmer and he doesn't even have the keys on his keyboard to spell out the function names!

    Here, have some Umlauts: äüöß
    Just in case you need to copy and paste them to include fahrvergnügen.h

    --
    bickerdyke
  143. Your ignorance by KGBear · · Score: 1

    should have no bearing on my ability to express myself. If you are unable to make your intentions clear within the language system that makes up most of what our species reads and writes, it's your own fault, not that of the language. Have you considered that you're just not talented?

  144. Try Python by mangu · · Score: 1

    why, oh tell me why, when I write a simple - trivial - bit of Java code, do I need to write functions for getters and setters all over the place - dammit, just declare them as gettable and settable

    Python is exactly like that.
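
    A short sketch of what that looks like in practice. The class names and the validation rule are invented for this example; the point is that attributes are readable and writable by default, and a property can be added later without changing any calling code.

```python
class Point:
    """Plain attributes: gettable and settable with no boilerplate."""

    def __init__(self, x, y):
        self.x = x
        self.y = y


class Temperature:
    """A property adds validation later; callers still write t.celsius."""

    def __init__(self, celsius):
        self.celsius = celsius  # goes through the setter below

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = value


p = Point(1, 2)
p.x = 10

t = Temperature(21.0)
t.celsius = 25.0
```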

  145. Hieroglyphics by Anonymous Coward · · Score: 0

    There is a reason that elegant alphabets won out over hieroglyphics. I'm always amused at how many people wish to take us back.

    "That's a simple one.

    Bird. Man with spear. Sideways fish. Beetle. Vase.

    It means, and this is just a rough translation,
    'A man with a spear trapped a bird and a sideways fish in a vase.'

    And there was also a beetle.

    That's just one possible translation."
    -- Teddy Roosevelt

  146. Thank you. by reiisi · · Score: 1

    Of course, the typical denizen of slashdot (such as myself) is of two minds on problems like this.

    The geek recognizes the simplicity of keeping the existing solution, but also recognizes the inherent attraction of the difficult problem of going with the technological solution.

    (Did I just say that?)

    --
    Computer memory is just fancy paper, CPUs just fancy pens with fancy erasers; the 'net is just a fancy backyard fence.
  147. Slashdot resolving non-problem .... by ceplinboston · · Score: 1
    I am not new here, so I am not surprised. Still, it always mildly surprises me how many people spend so much time dealing with a complete non-issue while related burning issues are ignored. So, how are we doing with Unicode support generally?

    PHP:

    A new major version has been under development alongside PHP 5 for several years. This version was originally planned to be released as PHP 6 as a result of its significant changes, which included plans for full Unicode support. However, Unicode support took developers much longer to implement than originally thought, and the decision was made in March 2010[13] to move the project to a branch, with features still under development moved to a trunk.

    PHP currently does not have native support for Unicode or multibyte strings; Unicode support is under development for a future version of PHP and will allow strings as well as class, method, and function names to contain non-ASCII characters.

    Ruby:

    Initial support for Unicode and multiple character encodings (still buggy as of version 1.9)

    Zarafa (that's my favorite pet-peeve)

    Internally, Zarafa uses the windows-1252 charset just about everywhere. This means that we're storing the entire subject, to, from, etc in windows-1252. Only at the moment that the message is converted to an outgoing RFC822 message for SMTP, is the charset conversion done to follow various RFC822 standards.

    I have my own bug for this on the Red Hat Bugzilla, which made it a blocker for me, but I wonder how somebody could write, in the 21st century, a groupware server which is capable of working only with the windows-1252 charset.
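
    To see why a single-byte charset is a blocker, here is a small Python sketch. The sample strings and the `fits_cp1252` helper are made up for this example; `cp1252` is Python's codec name for windows-1252.

```python
def fits_cp1252(text):
    """Return True if every character can be stored in windows-1252."""
    try:
        text.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


western = "caf\u00e9 \u20ac5"      # e-acute and the euro sign are both in windows-1252
czech = "\u0159eka"                # r-caron (U+0159) is not
japanese = "\u308a\u3093\u3054"    # hiragana is not

results = {s: fits_cp1252(s) for s in (western, czech, japanese)}
print(results)
```

    All three strings encode without trouble as UTF-8, so storing text that way internally would avoid the problem.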

  148. Re:The thing with ASCII [COBOL 2.0?] by Anonymous Coward · · Score: 0

    This proposal isn't about giving programmers more power to code, it's about making it easier for non-english speakers who aren't coders to read the code that their programmers write.

    Perhaps it's comparable to legalese. Making it proper English doesn't necessarily improve readability by non-lawyers. It's still gibberish to most of us without a legal background.

    It's not worth-while to slow down production programmers in a trade for the rare case where non-programmers will want to read code for an actual need (not just curiosity). Thus, it's an uneconomical requirement as long as there is such a trade-off.

    Agreed. There's these things called "documentation" and "specification" to communicate programmatic ideas where actual code cannot be applied: useful for both manager-types and new coders on a project.

  149. So does JavaScript by multipartmixed · · Score: 1

    Whoopee

    ECMAScript has been composed of Unicode characters since at least ECMA-262-3. The first version of JavaScript was UCS-2, so any version can use the basic multilingual plane. And even IE-6 runs edition 3.
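
    The practical limit of a 16-bit representation can be sketched in Python (used here only to count code units, not to model any JavaScript engine): a character inside the basic multilingual plane fits in one 16-bit unit, while anything above U+FFFF needs a surrogate pair.

```python
def utf16_units(text):
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


bmp_char = "\u03a9"          # U+03A9, inside the BMP
astral_char = "\U0001F600"   # U+1F600, outside the BMP

print(utf16_units(bmp_char), utf16_units(astral_char))
```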

    --

    Do daemons dream of electric sleep()?
  150. Bah by Anonymous Coward · · Score: 0

    Unicode has been the new black for most of the past decade

    You mean it stinks and nobody likes it?

  151. more tedious, yes by reiisi · · Score: 1

    One example, using Japanese and the typical Romaji (latin) input method filters for the word "apple".

    English, well, there it is: "apple". four different keys, one repeated stroke, and space or punctuation to delimit -- six strokes, and, with practice, you don't have to look at either the keyboard or the screen.

    The Japanese word Latinizes to "ringo", and that is what you usually type when using the IME in Romaji mode. But (problem 1) there is no standard small set of delimiters. So you reach for a conversion (henkan) key, which is usually either the space bar or a small key next to a shrunken space bar.

    Some methods have several henkan keys, depending on whether you want to just dump the conversion buffer (probably hiragana) or force to the other kana (probably to katakana) or get a list of candidate Kanji (Han characters). More often, there is only one (effective) conversion key that pops up a list of candidates, and the method assumes, for each candidate vocabulary element, a preferred conversion which it puts at the top of the list.

    So most people will end up having to check the list of candidates and make sure the desired one is selected. If not, the conversion key is repeated to select the next in the list. (Cursor keys can be used to scroll through the list in many IMEs.)
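
    A toy Python sketch of the two steps described so far: Romaji keystrokes become kana, then a conversion key offers a candidate list. The tables are tiny and hypothetical; a real IME uses large dictionaries and context, and reorders candidates as it learns.

```python
# Toy tables: hypothetical and tiny, not a real IME dictionary.
ROMAJI_TO_HIRAGANA = {"ri": "り", "n": "ん", "go": "ご"}
CANDIDATES = {"りんご": ["りんご", "リンゴ", "林檎"]}  # hiragana, katakana, Kanji


def to_hiragana(romaji):
    """Greedy longest-match conversion of Romaji into hiragana."""
    out, i = [], 0
    while i < len(romaji):
        for size in (2, 1):  # try two-letter syllables before single letters
            chunk = romaji[i:i + size]
            if chunk in ROMAJI_TO_HIRAGANA:
                out.append(ROMAJI_TO_HIRAGANA[chunk])
                i += size
                break
        else:
            raise ValueError(f"no kana for {romaji[i:]!r}")
    return "".join(out)


kana = to_hiragana("ringo")
candidates = CANDIDATES.get(kana, [kana])
print(kana, candidates)
```

    The typist still has to look at `candidates` and pick one, which is the step that gets in the way of touch typing.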

    Oh, and, incidentally, you need to be able to decide for yourself whether the current reference to "apple" is best represented with hiragana (for usual words in modern times), katakana (for foreign words, or emphasis, or to indicate that there is something special -- foreign? -- about this apple), or Kanji (in many cases, you may have an option on which Kanji). Or, maybe you actually want to use Romaji for some reason, although, in that case, you might have typed "appuru" instead). Options, options, options.

    Does this sound like something you're going to have an easy time of touch typing?

    It gets worse. In order to improve efficiency, the method "learns" from the user and reorders the candidate list for you.

    Professional data entry operators have professional input methods that do allow touch typing, but they take a lot of learning and training.

    Oh, there is kana mode for the keyboard, where the 46 (erm, 50 plus or minus) kana are laid out on the qwerty keyboard, on the right-hand side of the keys (I ought to put a link to something here, but I won't.) If you noticed that we have just laid kana out where there are numbers, that is correct. I am trying to learn to touch type the kana keyboard, but there are several versions with minor variations between them, and, really, the prevailing common sense is to use Romaji mode.

    In kana mode, "ri-n-go" is three keys, but you have to hit a modifier key after the "ko" to voice it (to "go"), so that's four keystrokes. Even if I do learn to touch-type in kana mode, it is not that much more efficient. I'm just being a typing geek trying to learn to do that.
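
    Unicode has a close analogue of that voicing keystroke: a combining voiced sound mark that composes with "ko" to give "go". This Python sketch shows the composition through NFC normalization; it is an analogy for the keystroke sequence, not a description of how any particular IME works internally.

```python
import unicodedata

ko = "\u3053"        # HIRAGANA LETTER KO
voicing = "\u3099"   # COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK

# NFC normalization composes the two code points into a single one.
go = unicodedata.normalize("NFC", ko + voicing)

keystrokes = ["\u308a", "\u3093", ko, voicing]   # ri, n, ko, voicing mark
word = unicodedata.normalize("NFC", "".join(keystrokes))

print(len(keystrokes), len(word), word)
```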

    The efficiencies are essentially leveled by the conversion step.

    Chinese is getting a phonetic conversion, but, historically, a stroke-radical input method has been preferred. Kanji (Hanji or Xanji when talking about Chinese) are constructed of moderately regular parts (radicals), but there are around 300 of those. That list of 300 is broken down for the keyboard. (Not unreasonable, most of the radicals are composites of simpler stroke sets.)

    But you still end up de-parsing at the character and word level, where in Latin you mostly de-parse at the word level.

    I'm not sure about China, but the typical Japanese attitude towards computer keyboards is that they would rather write and edit on paper and then type stuff in when they've got it fixed so that they can minimize typing. It takes considerable experience to get past the perceived inconveniences.

    One of the reasons the English context worked well in developing computers was the paucity of characters. A small set of glyphs is generally an advantage, even at the cost of overloading the punctuation and such.

    --
    Computer memory is just fancy paper, CPUs just fancy pens with fancy erasers; the 'net is just a fancy backyard fence.
  152. I beg to differ. by reiisi · · Score: 1

    That extra layer of parsing makes it much more difficult to touch type while looking at a source document, especially with the stock (ahem, Microsoft) IME.

    --
    Computer memory is just fancy paper, CPUs just fancy pens with fancy erasers; the 'net is just a fancy backyard fence.
  153. Kanji and Xanji the same, right, ... by reiisi · · Score: 1

    That's what the Unicode consortium wants you to believe.

    (Yeah, I beg to differ. I work with this stuff. There are issues in ideographs that the Unicode Consortium is still either ignoring or not aware of.)

    --
    Computer memory is just fancy paper, CPUs just fancy pens with fancy erasers; the 'net is just a fancy backyard fence.
  154. Not really. by luis_a_espinal · · Score: 1

    Japanese is typed using a more-or-less standard QWERTY keyboard.

    Tediously.

    I've seen my wife going at it (and Japanese people in all types of computerized businesses in Japan). They certainly don't have much of a problem. Kanji writing is a unique and complex task not easily amenable to typing on a keyboard. Hiragana and katakana are much more amenable, and most Japanese know how to write with "Romaji". The software they use simply finds the appropriate Kanji, hiragana and katakana after they type the corresponding Romaji.

    So in essence, they are typing just as we do using a Roman alphabet with the software doing the translation automatically just as modern word processors automatically correct misspells for you (and in both cases, the software gets it right most of the time.)

  155. APL all over again by luis_a_espinal · · Score: 1
    Even if now it is with Unicode as opposed to some crazy keyboard combination using non-standard keyboards, this path will only lead us to APL all over again. No thank you.

    This is more like a badly thought-out solution looking for a problem no one practical wants to even touch with a 10-foot pole.

  156. Hardly - more like the different JVM languages by ZmeiGorynych · · Score: 2, Insightful

    I really, really don't think so. Different tools for different jobs - a language for writing reliable infrastructure should look very very different from a language for exploration of datasets, for example - the first one must place emphasis on reliability and performance, the second on flexibility. Eg adding members to data structures on the fly is a great idea in the second case, but not in the first.
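
    Both styles can be sketched even within one language. In this Python example (class names invented for illustration), a plain class accepts new members on the fly, which suits exploration, while `__slots__` fixes the set of members up front, which is closer to what infrastructure code wants.

```python
class Record:
    """Exploratory style: new members can be attached at any time."""


class Packet:
    """Stricter style: __slots__ fixes the set of members up front."""

    __slots__ = ("src", "dst")

    def __init__(self, src, dst):
        self.src, self.dst = src, dst


row = Record()
row.score = 0.93          # fine: added on the fly

pkt = Packet("a", "b")
try:
    pkt.ttl = 64          # not declared in __slots__
    rejected = False
except AttributeError:
    rejected = True

print(row.score, rejected)
```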

    Sure you can try to sweep that under 'different paradigms', and indeed you could mix two arbitrary languages in the same file using some delimited blocks for example, and call it 'one language with different paradigms', but why would you want to? The convoluted multi-paradigm monstrosity that is C++ is a terrible example to us all there, in my opinion.

    I think instead the shape of the future will be more like all those different languages that compile on the JVM - Jython, Scala, Lua, and whatnot. They compile into interoperable modules without extra hassle, so in each module you can use the right tool for the job at hand.

    1. Re:Hardly - more like the different JVM languages by Twinbee · · Score: 1

      Are you sure we can't solve the problem eventually of having the best of both worlds - flexibility, and performance/reliability? Never is a long time...

      See my other post which answers some of the other points: http://slashdot.org/comments.pl?sid=1847736&cid=34090776

      --
      Why OpalCalc is the best Windows calc
  157. Re:The thing with ASCII [COBOL 2.0?] by Pharmboy · · Score: 1

    I have the same issue with Perl, I use it for a big project about once a year, even though I write small scripts (50 lines) with it regularly. And yes, I spend about half the time relearning or finding new ways to do things for the job. This also means my "style" has been very fluid over the years. I go back now to make changes to a 5000 line Perl program I made 7 or 8 years ago, and I'm like "wtf did I do it this way?".

    That said, Unicode would make it even harder since 99.9% of my programming is through an ssh shell. What I don't need is more characters and to have to remember odd character key combinations.

    --
    Tequila: It's not just for breakfast anymore!
  158. I agree with you, and Stroustrup by seanellis · · Score: 1

    It seems to me that this is an editor problem. And a lot of the blame for the parlous state of editors at the moment can be laid at the feet of Cpp, the C preprocessor.

    "In retrospect, maybe the worst aspect of Cpp is that it has stifled the development of programming environments for C. The anarchic and character-level operation of Cpp makes nontrivial tools for C and C++ larger, slower, less elegant, and less effective than one would have thought possible." - Stroustrup, Design and Evolution of C++.

    We should have a much better view of a program than a bunch of files containing characters.

  159. Mr. Kamp... by Stavr0 · · Score: 1

    Sir, Please Step Away from the APL keyboard.

  160. Live ASCII or die. by John+Sokol · · Score: 1

    I'll give you my ASCII when you pry it from my cold, dead hands!

    But seriously. EBCDIC would work just as well.

    ASCII is bad enough with hidden characters and where tabs and spaces look the same.

    Where 1 & l & I or 0 & O or ' & ` are nearly identical in the wrong fonts.

    How many times have I tried to compile only to get errors related to some invisible character that was imported from DOS or some guy's weird editor in Korea.

    Really we want simpler. 8 bits is 2 bits too many already. This guy wants 16-bit or 24-bit characters.

    Imagine 20 varieties of A that all look the same but behave differently!

    I'm telling you right now: I will be doing ASCII for the rest of my life. I don't even like GUI IDEs; I still prefer VI!

    Imagine what a mess when you have 20 A's that all look identical but

    --
    I am always doing that which I can not do, in order that I may learn how to do it. - Pablo Picasso
  161. I concur with needing more symbols by lsatenstein · · Score: 1

    I used to be an APL guru, programming in VSAPL, APLSV and some other APL versions. APL is super fast development for many problem solving issues. It could have a rebirth if we could extend ASCII.

    --
    Leslie Satenstein Montreal Quebec Canada
  162. using the right hand side of the screen to program by TheCouchPotatoFamine · · Score: 1

    I'm going to go out and say the problem this guy has is probably more with editors and GUIs, as you mention in passing, than with ASCII.

    Editors can and SHOULD go much, much farther in mating code (functions written as they are now) with structure, that is, functions grouped and abstracted (and available to be edited) in groups that do not need a top-to-bottom representation. We ARE NOT talking diagrams here - that implies a tree or flow of direction. The flow is defined via function calls in ASCII, like normal, and no connecting lines are needed. What is needed is the idea of groups (or aspects, or categories) that can reveal structure on a less restricted plane (pun intended) than a top-to-bottom file structure. Files are an unnecessary and evil holdover from people building what they could, not what they wanted to. Think about it; the flow of a program in general depends NOT AT ALL on the linear presentation of code that's rampant (let's ignore Python's use of files-as-module, that can be tweaked).

    The reason I bring this up is that the argument here is that the manipulation of symbols using ASCII is tedious, slow, constrictive. I submit the time spent and problems encountered ARE how symbols are presented and analyzed (esp.for people learning a code base for the first time) -- but on the function or class level, not the individual character.

    I strongly believe (and will build eventually if I'm not beaten to it) are editor environments that succesfully understand code to deliver us from the ultimate tyranny of the file itself.

    And that's my rant :)

    --
    CS majors know the time/space tradeoff, but they never get taught the 3rd, crucial, tradeoff of the set: comprehension!
  163. Then why not get rid of English as well? by cpghost · · Score: 1

    There are some esoteric non-English based programming languages out there. Just imagine the fun porting this and similar OSS programs!

    --
    cpghost at Cordula's Web.
  164. Not so much confused as imprecise by Firethorn · · Score: 1

    You're a bit confused---Classical Chinese had the 'one word one character' thing, and Japanese has three character sets (five if you include Arabic numerals and the extensive use of the Roman alphabet).

    Less confused than imprecise, I think. Both have a 'one word one character' set, with the confusion that in China you also had multiple spoken languages all using the same written language. China also ended up with phonetic characters as well, so there's a different character set. Thus my use of 'like 3 character sets', because how many sets they have depends on how you or your school defines them.

    Book sales are amongst the highest in the world, and Japanese newspapers have the highest circulations in the world. The extra time spent reading in school must be paying off for the Japanese.

    You have a point there. I used to be a book-a-day guy. My eyes can't take that anymore, unfortunately. I actually find a monitor easier to read from these days, but so many e-books are so annoying with their software that I prefer (free) fanfiction. Sure, there's a lot of dreck out there, but there's enough that the best 1% rivals commercial books.

    --
    I don't read AC A human right
  165. Kamp is an illiterate idiot by whitroth · · Score: 1

    I can see it now: code written in fonts that were only used once, and then no one ever wanted to use them again. I've got friends who did that, but they got better....

    So, they're not teaching logic and functions anymore, it's all Magic! what computers do....

                  mark

  166. Microsoft's Unicode guy by toby · · Score: 1

    Michael Kaplan has an interesting blog.

    As unlikely as it sounds from context, he seems to care a great deal about correctness. It also paints a vivid picture of how hard Unicode is to get right.

    --
    you had me at #!
  167. UTF-8... by toby · · Score: 1

    by moving to unicode, git svn bazaar mercury and cvs all have to be updated to understand how to treat unicode files - which they can't (they'll treat it as binary) - in order to identify lines that are added or removed, rather than store the entire file on each revision. bear in mind that you've just doubled (or quadrupled, for UCS-4) the amount of space required to store the revisions in the revision control systems' back-end database

    There is this thing called UTF-8 which VCS already handle just fine (including even humble CVS, afaik).

    Not appreciably larger. No larger at all, for characters in the ASCII set.
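
    A quick way to check the size claim yourself, as a minimal Python sketch (the sample source line is made up for illustration):

```python
# A made-up line of pure-ASCII source code.
text = 'print("hello, world")\n'

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

# For pure-ASCII text, the UTF-8 encoding is byte-for-byte identical.
same_bytes = ascii_bytes == utf8_bytes
print(same_bytes)

# The doubling/quadrupling only happens with fixed-width encodings
# (the "-le" variants are used here so no byte-order mark is added).
utf16_len = len(text.encode("utf-16-le"))
utf32_len = len(text.encode("utf-32-le"))
print(len(utf8_bytes), utf16_len, utf32_len)

# A non-ASCII character costs extra bytes only where it actually appears.
lambda_len = len("\u03bb".encode("utf-8"))  # Greek small letter lambda
print(lambda_len)
```

    Since the bytes are identical, a line-based diff sees an ASCII-only file exactly as before.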

    --
    you had me at #!
  168. Unfortunately, by toby · · Score: 1

    Railways, like character sets, are one of those situations where "close" doesn't quite cut it.

    --
    you had me at #!
    1. Re:Unfortunately, by rubycodez · · Score: 1

      the idea was that they're very close to dimensions of medieval carts and the paths/roads made for them, and really not urban legend at all despite what Snopes says.

  169. vi? by danlip · · Score: 1

    The author admits writing the article in vi. Can we mod TFA as -1, Hypocritical?

  170. Even plain ASCII is too much for Google. :-( by Something+Witty+Here · · Score: 1

    > Everyone who tried to do something useful in APL, put up your hand.

    APL is a wonderful language.

    > Restricting digital storage to ones and zeros is needlessly polarizing
    > and limiting. Why not allow a 0.5 bit value?

    Word is the Russians tried to build trinary computers but the
    magnetic cores wouldn't stay unmagnetized.

    My stupid keyboard has redundant keys for the digits and a few others,
    but no Umlaut, no Eszett and no Greek letters. Who designs this crap?

    Some things can't even handle plain ASCII. Can anyone explain how
    to google for "DVD-RW" or for "DVD+RW" without getting a gazillion
    false hits? Google would be *so* much more useful if it handled
    regular expressions.
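
    For what it's worth, this is the kind of search I mean, sketched in
    Python. The sample strings are made up, and this says nothing about
    how Google actually works.

```python
import re

# Escape the plus sign so it is matched literally instead of being
# treated as the "one or more" operator.
pattern = re.compile(r"DVD\+RW")

# Made-up sample "documents".
docs = ["Blank DVD-RW discs", "DVD+RW drive", "DVD RW"]

matches = [d for d in docs if pattern.search(d)]
print(matches)
```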

  171. We don't need C++. We need --C. by Anonymous Coward · · Score: 0

    A better way to go would be to *reduce* the number of symbols allowed in computer programs. This would reduce the number of errors, making programmers more productive.

  172. Unnecessary by Brafil · · Score: 1

    Well, if Perl can get along with pure ASCII... And humans are definitely not descended from octopuses, so no.

  173. I've thought about this - solution is ... by garyebickford · · Score: 1

    The same thing we already do for user applications - use i18n mappings.

    This could be done quite simply. Starting with the predefined symbols in the language ('+', 'sin', etc.), provide a translation table to any human language. Then at the top of the file, provide a DEFINE or equivalent for the language that says what human language the code is stored in (e.g., Greek).

    Then the reader can open it in his/her own language, for example in English. Then it will look like it was originally coded in English, can be edited in English, then resubmitted. Then the person working in Greek would be able to open it and it would be in Greek for that person.

    This can be extended to function names, etc. without too much additional work - the major work would be that the original coder or someone in the group that supports this open source (of course!! :) ) body of work, would have to build the proper translation tables.

    This should actually be easier at this level than human languages, because at this level, programming languages have regular syntax, have fixed semantic content and lack idiomatic expressions.

    Since there is already so much infrastructure for supporting and defining i18n translations, it should be relatively easy to modify the language compilers and interpreters to perform this step in a pre-parser.
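
    A rough Python sketch of what such a pre-parser step might look like. Everything here is hypothetical: the table contents, the German keywords, and the `translate` helper are made up for illustration, and the sketch naively rewrites words inside string literals too.

```python
import re

# Hypothetical translation tables: canonical (English) keyword -> local keyword.
TABLES = {
    "en": {"if": "if", "else": "else", "while": "while", "print": "print"},
    "de": {"if": "wenn", "else": "sonst", "while": "solange", "print": "drucke"},
}

# A word that starts with a letter or underscore, followed by word characters.
WORD = re.compile(r"[^\W\d]\w*")

def translate(source, src_lang, dst_lang):
    """Rewrite keywords from src_lang into dst_lang, leaving other words alone."""
    # Invert the source table: local keyword -> canonical keyword.
    to_canonical = {local: canon for canon, local in TABLES[src_lang].items()}
    to_target = TABLES[dst_lang]

    def swap(match):
        word = match.group(0)
        canon = to_canonical.get(word)
        return to_target[canon] if canon is not None else word

    return WORD.sub(swap, source)

german = "wenn x > 0: drucke(x)\nsonst: drucke(0)"
english = translate(german, "de", "en")
print(english)
```

    Translating back with `translate(english, "en", "de")` gives the original text again, which is the round trip described above. A real version would work on the parser's token stream so that strings and comments are left alone.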

    I was going to suggest this for PHP6 but I don't know if I got around to posting it to the PHP6 team.

    --
    It's easier to be a result of the past, but more fun to be a cause of the future! http://www.spacefinancegroup.com/
  174. I may have an old APL manual around... by CptNerd · · Score: 1

    If we want to go non-ASCII we could always switch to programming in APL (or maybe ObjectAPL), and have completely unreadable programs. Or else learn to program in Chinese.

    --
    By the taping of my glasses, something geeky this way passes
  175. 64 Characters should be enough for anyone. by Anonymous Coward · · Score: 0

    I vote for FIELDATA. Upper case letters only, just six bits per character. Aaaah, the days of core memory...