Mr. Pike, Tear Down This ASCII Wall!
theodp writes "To move forward with programming languages, argues Poul-Henning Kamp, we need to break free from the tyranny of ASCII. While Kamp admires programming language designers like the Father-of-Go Rob Pike, he simply can't forgive Pike for 'trying to cram an expressive syntax into the straitjacket of the 95 glyphs of ASCII when Unicode has been the new black for most of the past decade.' Kamp adds: 'For some reason computer people are so conservative that we still find it more uncompromisingly important for our source code to be compatible with a Teletype ASR-33 terminal and its 1963-vintage ASCII table than it is for us to be able to express our intentions clearly.' So, should the new Hello World look more like this?"
There are several potential problems, but also some very old solutions. The problem with your 'get a better keyboard' idea is that source code is another form of communication. I work on a couple of projects that have contributors in France, Britain, Taiwan, the USA, China, and Japan. ASCII is the lowest common denominator - all of these people use different keyboard layouts, optimised for their native language, but all of them can enter ASCII.
The solution was available in pretty much every Smalltalk implementation from the early '80s. Smalltalk uses the caret character for return statements, but pretty much any editor will display it as an up arrow (which doesn't appear on any keyboard that I've seen). There's no reason why the editor has to display exactly the same characters that appear in the source code, or has to insert exactly the typed characters into the source file.
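To make that concrete, here is a minimal sketch of the display/storage split, assuming a hypothetical editor that keeps a small glyph table. The names DISPLAY_GLYPHS, to_display and to_storage are made up for illustration and are not taken from any real Smalltalk environment.

```python
# Hypothetical glyph table: character stored in the file -> character shown on screen.
DISPLAY_GLYPHS = {"^": "\u2191"}  # ASCII caret is drawn as an up arrow

# Reverse table, so an arrow in the editor buffer is written back as plain ASCII.
STORAGE_GLYPHS = {shown: stored for stored, shown in DISPLAY_GLYPHS.items()}


def to_display(source: str) -> str:
    """Return the text the editor would draw for ASCII source text."""
    return source.translate(str.maketrans(DISPLAY_GLYPHS))


def to_storage(buffer_text: str) -> str:
    """Return the ASCII text the editor would write to disk."""
    return buffer_text.translate(str.maketrans(STORAGE_GLYPHS))


if __name__ == "__main__":
    line = "^ self size"
    print(to_display(line))              # the caret is shown as an arrow
    print(to_storage(to_display(line)))  # and saved as a caret again
```

This is deliberately naive: it would also redraw a caret inside a string literal or comment, so a real editor would want to apply the table per token rather than per character. The file on disk stays pure ASCII either way.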
I am TheRaven on Soylent News
For programming languages, we already have that - \u1234 or \U12345678 are used as escape sequences in C++, Java and C# for just this purpose. There's nothing stopping an IDE from rendering them as if they were actual symbols and not escape sequences, too, though I haven't seen that in practice.
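A rough sketch of what such rendering could look like, assuming only the \uXXXX and \UXXXXXXXX forms mentioned above. The function name render_escapes is hypothetical, and the pattern is a simplification rather than any particular language's full lexical grammar.

```python
import re

# Matches \u followed by 4 hex digits, or \U followed by 8 hex digits.
ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def render_escapes(source: str) -> str:
    """Return display text with Unicode escapes drawn as the characters they name."""

    def replace(match: re.Match) -> str:
        code_point = int(match.group(1) or match.group(2), 16)
        # Leave the escape visible if it is out of range or a lone surrogate.
        if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            return match.group(0)
        return chr(code_point)

    return ESCAPE.sub(replace, source)


if __name__ == "__main__":
    # Raw string, so the backslash sequence reaches the function unprocessed.
    print(render_escapes(r'const char *s = "caf\u00e9";'))
```

The file itself is untouched; only the on-screen text changes. Note that \U12345678 is not a valid code point, so this sketch leaves it as written, and it does not try to detect an escaped backslash in front of the u.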
A lot of C compilers expect Unicode source files. As I recall, the Windows headers are all UTF-16, so the MS compilers are designed to handle Unicode input. Clang expects UTF-8 source code. The language rejects non-ASCII symbols in identifiers, but you can use them in comments and string literals. The D compiler ignores HTML markup in the source code (I think - it did ten years ago when I last looked), so you can mark up your source code in any way that you like and have this markup preserved, although it's semantically irrelevant.
>>On the other hand, you have German. They have so long words and even longer sentences
What are you talking about? In German, you can have words that are entire paragraphs. =)
For example, the Reichsdeputationshauptschluss of 1803 ended the Reichsunmittelbarkeit of the HRE.
(And no, Firefox, those words are not typos, just sentences.)