Why Haven't Special Character Sets Caught On?
theodp asks: "Almost forty years after Kenneth Iverson's APL\360 employed neat Selectric hacks to implement Special Character Sets to express operators with a single symbol, we're still using clunky notation like '<>', '^=', or 'NE' to represent inequality and cryptic escape sequences like '\n' to denote a new line, even though the Mac brought GUIs to the masses more than twenty years ago. Why?"
Because it's simple, it's easy, and it works. Absolutely no need to futz with it.
Any more pointless issues you'd like to hash out?
And special characters wouldn't be?
Gosh, I don't know!
Now, if you will excuse me, I need to create a local variable named <The Symbol for the Artist Formerly Known as "The Artist Formerly Known As Prince">
www.eFax.com are spammers
In programming? Most languages seem to be designed with ASCII in mind, so you have to stick with what's available there.
In general? I think it's a matter of input methods. Give me an input method where it takes only two keystrokes to type "" and I'll use it instead of "NE" or "". If I need to use a vulcan death grip, remember a code, or find it in a character map, I'm only going to bother when I have motivation: either making a point, like earlier in this paragraph, or making a polished document. Why go to the effort in a casual email, or a forum post, when it's much easier to type "" instead?
The only GUI I need is vi, thank you very much...
I mean, they're better, right?
I entered an actual not-equal sign in that post, and Slashcode stripped it out!
my OS is where UTF8 [pdf] was invented.
There are places where the networks are not touching,and there are places where they are-Boeing's Lori Gunter
\n is cryptic and APL isn't?
I'd say it's more a question of 'choose your poison'. There is a learning curve whether one aims at mathematics-based notation schemes or historical computer science notations, and the market has already chosen (30 years ago) which one it prefers.
And not without cause. Human language looks a lot more like modern programming languages than mathematical notation, and a major goal of programming language design is to make it as straightforward as possible to tell the computer what you want it to do. One might object that by that argument Cobol is better than C, but humans, especially experts working in a specific domain, like abbreviations too. Cobol is hated because it doesn't allow you to abbreviate, not because it is hard to read, after all. APL or other such specialised syntaxes are hard to read and they don't fit closely enough with the way non-mathematicians think to be intuitive.
Now sonny, sit down a second and listen to grandpa rant about the good old days. The truth is, when I talk about the good old days, it's not because the days were actually good. It's because I have a sucky memory and questionable taste.
Now it is TRUE that I once did do programming in APL. This was on an old Zenith 8088 based PC clone with 640K of memory, a CGA display, and a 20 meg hard drive. The system itself worked rather well. If you could work a line editor, the development environment was all you could want. The problem was all the little stickers that went on the keys. Every key mapped to about three other symbols besides the normal ones, and just about every key had a little sticker on it. It was NOT fun. Just because your computers can display characters that look like Chinese doesn't mean that it's a good idea.
Fascism trolls keeping me up every night. When I starts a preachin', he HITS ME WITH HIS REICH!
Because we don't need to change, just for the sake of it, to a system that isn't supported by a lot of software and hardware. Why not just change your software to interpret the characters as an image, as some software already does with smilies?
Display was never the issue with APL. There are implementations of APL that use keywords instead of symbols. It's just that turning everything into an operator makes for really dense, hard-to-maintain code.
I'm reminded of Forth, which lacks APL's weird symbols, but shares its reputation for dense code. In its heyday, Forth programmers justified using it by claiming it made them more productive. And that's true — if you define "productivity" as "number of lines of new code hacked out per day". But code isn't just written, it's maintained, and dense languages are not maintenance friendly.
Because standardization of extended character sets, via Unicode, is a relatively recent development. Hence, there's a lot of software around that still doesn't handle Unicode.
For example, I switched to bash because tcsh didn't cope with Unicode. Mozilla's Unicode support is incomplete--card symbols defined in the HTML 4.01 standard don't show up properly on the Mac, even though it definitely has them in its standard fonts. Many text editors don't support Unicode. And so on.
In fact, it's only recently that Slashdot was fixed to allow us to use words like "cliché" and enter amounts of money in Pounds Sterling like £5.99, even though those 'special' characters were part of HTML 1.0. Forget about using the aforementioned card symbols on Slashdot—we got 1996's CSS a couple of months ago, maybe we'll get 1999's HTML 4 in 2008?
Next you add in the fact that most people are too lazy to even learn to spell correctly, far less learn how to type an e with an acute accent, and you have a recipe for today's state of the web.
GCHQ Quantum Insert installed. If only our tongues were made of glass, how much more careful we would be when we speak
{disclaimer: i'm a closet fontographer.}
.. and the description of these symbols is limited by strict hardware limits: economic, social, cultural elements all have a part to play in the definition of input devices. where i say QWERTYZXCV, you say QWERTZYXCV.
.bin, and "X" vs. "Y", blah blah, ad infinitum..
.. it's only a tiny clique that can do the alt-numpad thing, and even fewer who choose to jump out of the ASCII pool and towel off..
i've thought about this question since 1978, as i have encountered over the years since then a grand litany of different ways of describing symbols in such a way that they can be standardly used, and i have come to a very simple answer. humans are stuck on a symbol treadmill with infinitely smooth bearings.
fontography is a lesson of symbols
we haven't seen terribly wide-spread specialization of symbols because of the producer-/consumer- cults of USKEY101, and people's unfamiliarity with alt-numkeypad chops, and Mac vs. PC, and ASCII vs. UTF-8, and XML vs.
the fact is, perhaps deep down inside we know we should be grateful for what we've got, and let the "!=" and ">=" expressions, 2 lonely bytes in a vast nasty sea, stand as testament to the human desire to at least, a little bit, get along on the same key. they may not be pretty, but pretty much everyone can get to those two bytes and use them when they need to
; -- the corruption of government starts with its secrets. a truly free people keep no secrets. --
The real question here is 'Gosh, languages don't use all the same syntax to represent mathematical ideas. Isn't there some way we could force them to do so?' And the answer is, succinctly, no.
And for non-visual characters like 'newline'.... what other idea, exactly, did you have? \n is pretty straightforward, once you know how it works. I submit that some random symbol would be worse than what we have now.
The musician Prince tried the glyph substitution trick, if you recall, and it wasn't tremendously successful.
I tried to learn Forth once... I thought it was made for people that thought asm was too straightforward.
There does seem to be a sweet spot for code density vs readability.
C and C++ seem to have found it for most people.
At one end you have COBOL, Pascal, and Ada.
At the other you have Forth and APL.
C, C++, C#, Python, Java, and the other C-like languages seem to be in the middle.
See my blog http://ilovecookes.blogspot.com/ for light hearted technical information.
Yep. Note that "smart quotes" didn't come into general use until Microsoft implemented them in the auto-complete features in Microsoft Word--the same ones that will do things like correct "cafe" to "café." Of course, they used non-standard characters to do it, leading to lots of Windows-only pages and programs like the Demoroniser, and they weren't quite smart enough, leading to millions of people worldwide using left-single-quotes instead of apostrophes for things like '90s or 'Twas--and not just in Word documents, either!
Still, it was the fact that it was, essentially, effortless, that led to people actually using them instead of the straight-up-and-down quotation marks and apostrophes that we have on the keyboard.
recent by geological time
UTF was invented and used as the *only* charset in an OS before Windows 95 was even in beta
http://plan9.bell-labs.com/sys/doc/utf.html
I think a large part of it is because, even if we have the ability to display the characters, we don't have a convenient way to enter them. The keyboard doesn't have a Sine symbol key. Further, expanding the keyboard to include these symbols will just make it unwieldy. I suppose one could have the display automatically convert sequences into special characters, much like modern word processors perform auto-superscript, but this might cause problems when editing. I personally prefer it as-is.
</smartass>
Dewey, what part of this looks like authorities should be involved?
It's pretty simple: Lowest common denominator. Creating special character sets creates incompatibilities with other machines out there. That's why ASCII was such a boon, and why character sets like PETASCII, ATASCII, and others fell by the wayside. (And if you really want some character set fun, try EBCDIC sometime).
All you need to do is have an ASCII chart handy and you can deal with text on any platform. Plus, the special symbols aren't on my keyboard. If I have to go hunting through obscure key combinations I need a pretty damn good reason!
That's one of the worst Ask Slashdot articles ever - it almost tops this one. Is it meant to be a troll? What exactly is the connection between your question about special character sets and the link to Wikipedia's "Apple Macintosh" entry? Apple fanboyism?
Back to your question: What should be included in the special character sets? Do we need a set for every programming/markup language?
There's one character set, it's called Unicode and we're using it everywhere. If two-bit Pearle code running braindead websites like Slashdot can't handle the entirety of it, then someone (hint: not me) might have a problem.
Suppose you go to the Unicode folks and say "let's use a spare set of codepoints to encode programming language constructs".
OK, so, which constructs?
Well, we've got the basic operators of C. Java and C++ can share a lot of those. Then we've got the stuff in Ada, which has a few of its own. And OCaml has a few more. Haskell can use some of the OCaml ones, but we'll distinguish them with a diacritic to mark them as lazy...
Oh drat, someone sent me a program in Perl, and I haven't got the right font. It just looks like line noise!
From the Wikipedia entry linked in the original post...
"APL, in which you can write a program to simulate shuffling a deck of cards and then dealing them out to several players in four characters, none of which appear on a standard keyboard." - David Given (?)
"APL is a mistake, carried through to perfection. It is the language of the future for the programming techniques of the past: it creates a new generation of coding bums." - Edsger Dijkstra, 1968
Now, really APL is just functional programming in disguise, with a pathetic lack of flow control constructs. Now, if you're familiar with Scheme or LISP, you end up wondering just what the hell the point is of doing a bizarre sequence of keystrokes to produce a strange glyph, when you could instead just type out "reduce" or "map" or whatever it is you wanted to do.
That said, if you're a Linux/X user and want to input oddball characters occasionally, I highly recommend getting familiar with xkeycaps/xmodmap. You can turn one of the more useless keys into Multi_key. You hit that key and do something like "e'" for é, "ss" for ß, "L=" for £ and so on.
Long ago, the Lisp Machines roamed the Earth. They have a pretty wide set of characters available, as well as a swath of bucky bits. When people were writing for just Lispms, they'd use the extra characters sometimes. But then the code couldn't be brought over to other machines easily, or sent to a line printer, or... well, you get the idea. When Common Lisp came out, it defined a "standard character set"... a minimal set of characters that a Lisp implementation must support, and portable programs could only use it. This put an end to the flagrant use of the non-portable characters.
Same goes for the Knight keyboard at SAIL, the MIT keyboards that spawned the Lispms, etc, etc. The characters that aren't part of ASCII-- and aren't on most people's keyboards, let alone font sets-- cause too many problems.
That said, AppleScript does allow the use of some other characters in the MacRoman set. For example, not-equals can be written with the not-equals character, or <>, as you prefer. Personally, though, I stick with ASCII. Not even INTERCAL was perverse enough to deviate from it.
Have a large number of individual characters rather than a few characters that can be combined in many ways?
Why, you sound like you're in favor of CISC.
"Give orange me give eat orange me eat orange give me eat orange give me you." -Nim Chimpsky
\r\n, =, !=, etc... make sense to programmers. They understand the language. Just like the design of 32nd, 16th, 8th, 1/4, 1/2, and whole notes, along with extra notation to modify their true length of play and volume, makes sense to musicians. Why waste time and effort to make it readable for the masses when the masses probably don't care? If they did they'd learn to read the language.
COBOL is hated because it was a language designed for programmers not by programmers. Same as Java.
Try C, LISP, Smalltalk or Ruby and you start to feel that the language is helping you. Cobol & Java feel like they are getting in the way.
Personally, I love Ruby but that might just be me.
Because nobody wants to be seen typing on the "short" bus with a "special" character set.
--
"Outlook not so good." That magic 8-ball knows everything! I'll ask about Exchange Server next.
What might be interesting is if you can have your keyboard switch modes.
I could put the keyboard in math notation and automatically the keys on the keyboard display math symbols in a standardised pattern (like QWERTY is for letters but for math). Other modes could be added later.
On Slashdot a few months back there was a keyboard on which the labels on the keys are dynamic. I think that is going in an interesting direction.
It reminds me of how the computers in Star Trek: The Next Generation might behave, where the terminal/key layout is specific to what you are doing (engineering, medical, etc.).
Basically, instead of having stupid Windows dialogs where you click on stuff, you can use a keyboard designed to do your task.
It also amazes me to see Asian languages being typed into a computer.
You may as well ask why we don't use specialized languages for specific tasks, such as using C for pointers, Java for objects, FORTRAN for mathematics, etc. all within the same project, perhaps even in the same source file. Why should the compiler care what language we use? The computer can handle all those languages just as the computer can handle special symbols. Another similar question would be: Why don't we have specialized processors dedicated to certain tasks, like a speech processor, speech recognition processor, sound generation processor, video coding / decoding processor, I/O processor, and so on? I think the truth lies in the efficiencies of volume production. Specialization just isn't cost effective these days. People who do tend to reach out and try new ideas rarely get far today -- which is a real shame.
Ouch! The truth hurts!
There aren't so many of us in this thread with the background to actually do so. I learned APL when I took a calculus course at the local university because I was applying to schools out of province where calculus was taught in the terrible idea known as "grade 13". The math course allowed me access to the CS lab and I soon started to thrive on being able to write programs in APL that ran a *lot* faster (sometimes factors of 10) than any of the programs written in compiled Pascal by the CS undergraduates (what a bunch of lamers).
One needs to understand that in APL (as interpreted as it gets), certain idioms applied to large arrays are essentially super-compiled, in much the same way that the C++ non-container vector<bool> did funky stuff until they killed it (they have killed it by now, haven't they?). It was no small benefit to me in my jousting with the lamers that in APL I could focus all my talents on choosing the best algorithm, because I was never more than 80 keystrokes away from a major breakthrough. Except when the problem demanded certain kinds of data structures where APL was inherently unsuitable. It's a good thing no one ever challenged me to the fastest program to implement a position tree.
I learned a lot from my time with APL because it reduces certain aspects of programming to its bare essentials, while saying almost nothing about anything else. I learned which portions of the programming task can be delegated to a superior notation, and which portions can't. For the most part I liked the APL notation. Stupid manipulations of the rank vector (reshape) using the rho operator to effect extra arguments to a function call, that was life in the gutter. So were the trig operators for that matter.
At the same time, I don't miss most of the symbols / operators all that much either. There are three that have stuck with me ever since. The max/min (round up/down) operators (bar with a flat bit to the right at the top or bottom) and the high minus (minus sign tangent to the top of the regular digits).
Let's assume we have a C language implementation from the day when an integer was 16 bits and a long was 32 bits. For this C language host, the expression -32768 does *not* do what one first supposes: represent the least possible 16-bit signed integer value, because this is parsed as the negation operator applied to the long integer constant 32768. How stupid. Give me back my high minus sign! The constant ambiguity between subtracting a positive constant and adding a negative constant does *not* help one in recalling to mind the derivation of the formula involved (with the original symmetries intact).
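The parsing half of this claim (a minus sign in front of digits is an operator applied to a positive constant, not part of the literal) can be observed in other languages too. A minimal sketch in Python, used here only as a stand-in: Python integers have arbitrary precision, so the 16-bit versus 32-bit type consequence described above does not arise, and only the shape of the parse tree is shown.

```python
import ast

# Parse the source text "-32768" as an expression; ast.parse does not fold constants.
tree = ast.parse("-32768", mode="eval")
node = tree.body

# The parser yields a unary minus applied to the positive literal 32768;
# the grammar has no negative integer literals.
is_unary_minus = isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub)
operand_value = node.operand.value if is_unary_minus else None

print(type(node).__name__, operand_value)
```

APL's high minus avoids this by making the sign part of the number itself.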
Other symbols from APL I've continued to use in my own notation are the take and drop operators (preserving the overtake and undertake semantics) and the encode / decode operators. All the rest I've discarded.
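For readers who have not met take and drop, here is a rough sketch of their behaviour on flat lists, written in Python. The function names, the `fill` parameter, and the restriction to one-dimensional lists are assumptions made for illustration; real APL applies these to arrays of any rank and picks the fill element from the array's type.

```python
def take(n: int, items: list, fill=0) -> list:
    """APL-style take: first n items (last |n| if n < 0), padded with fill."""
    if n >= 0:
        # Overtake: asking for more than exists pads on the right.
        return items[:n] + [fill] * max(0, n - len(items))
    k = -n
    # Negative overtake pads on the left instead.
    padding = [fill] * max(0, k - len(items))
    return padding + items[max(0, len(items) - k):]


def drop(n: int, items: list) -> list:
    """APL-style drop: discard the first n items (last |n| if n < 0)."""
    if n >= 0:
        return items[n:]
    return items[:max(0, len(items) + n)]


print(take(5, [1, 2, 3]), take(-2, [1, 2, 3]), drop(1, [1, 2, 3]))
```

The padding on overtake is what a plain slice does not give you, which is presumably why the semantics are worth preserving.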
The other bugaboo I've retained, having learned early how to think APL-style, is definitions of the modulus operator (% in C/C++) that, for a positive modulus B, do not ALWAYS return a value on the semi-open interval [0..B). Howls of disgust when this function does anything different for negative values of A, a fraud of the highest magnitude perpetrated by the same idiots who neglected to supply the high minus sign in every other language I know.
I must however reserve the innermost circle of hell, following along in P.J. Plauger's footsteps, for those who implement malloc(0) to cause an "instructional" abend.
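The two conventions being contrasted here can be put side by side. The sketch below uses Python, whose built-in % is floored (the result takes the sign of the divisor), and `math.fmod`, whose result takes the sign of the dividend, which is the truncating behaviour C99 and later specify for %. The helper names are made up for this example.

```python
import math


def floored_mod(a: int, b: int) -> int:
    """Remainder whose sign follows the divisor (Python's built-in %)."""
    return a % b


def truncated_mod(a: int, b: int) -> int:
    """Remainder whose sign follows the dividend, as math.fmod computes it."""
    return int(math.fmod(a, b))


for a in (7, -7):
    print(a, floored_mod(a, 3), truncated_mod(a, 3))
# prints:
# 7 1 1
# -7 2 -1
```

Only the floored version stays inside [0..B) for every A when B is positive.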
Unicode contains characters such as U+2260 (NOT EQUAL TO). Unicode has certainly caught on; all HTML documents use that character set, for instance. So why the need for a special character set?
Perhaps you are asking why people don't choose to use such characters - I guess it's just ignorance. After all, if somebody who has gone to the trouble of submitting an Ask Slashdot doesn't know about these characters, why would the average person? Take a look at the code charts sometime.
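A quick way to check that code point without opening the charts is Python's standard `unicodedata` module. This is only a small illustration; the variable names are arbitrary.

```python
import unicodedata

not_equal = "\u2260"

# Look up the official character name and the UTF-8 encoding of the code point.
name = unicodedata.name(not_equal)
encoded = not_equal.encode("utf-8")

print(name)          # NOT EQUAL TO
print(len(encoded))  # 3
```

So the single symbol costs three bytes in UTF-8, against two for an ASCII "!=".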
Bogtha Bogtha Bogtha
The concept that APL code is "hard to maintain" is correct to first approximation, but it's more myth than reality when one digs deeper into the question. Most of the densest lines of code I once concocted in APL were 100% maintenance free: efficient and correct over the entire usable operand range. The density of the code squeezed out many degrees of freedom for making stupid errors even before you began.
There were other factors, having little to do with code density, that made APL systems hard to maintain. One was the psychological feeling that writing twenty lines of comments to describe one line of code was somehow ungainly. I overcame this feeling within myself rather early. In fact, I wrote so many lines of comments for each line of code that my first work-term supervisor wrote a program to crawl through all the functions in one of my workspaces to *remove* every line of comment I had written, because he somehow thought he would understand my code better if he could fit it all onto the screen at once. His problem was that he didn't understand the *concepts* in my code. One of the things about raw APL code is that there are few surface markers that distinguish necessary manipulations from deep concepts. In my C language code, the necessary manipulations are largely gathered together in the initialization statements for local variables. Well, how much should a language design be based on protecting the programming team from supreme idiocy?
The second factor that made APL hard to maintain is that it tried to force every concept to become a nail. The array primitive was surprisingly powerful, but it just didn't handle certain kinds of data aggregates at all well. And neither could you push this structure in the lexical direction, because there was no regex facility either.
And finally, the notion of a "workspace" was itself suspect. Every function was its own text. There was no text anywhere that declared or described or controlled all the global variables that the workspace would necessarily include. There was no textual grouping of related functions into a higher-order interface or language facility. These decisions were made because APL originated in the teletype era. It had nothing to do with expressive density.
I think there is also an illusion at work that if you spend the day performing a maintenance task by visiting twenty source files and pawing through several thousand lines of code, that you are working with some greater efficiency than the guy who spent the whole day staring at a single screenful of starkly beautiful APL scratchings. That's not so obvious to me having been there.
OK, here's the thing. In APL if a programmer decided to cut corners and forsake the "stark beauty" that made APL a workable language, there was precious little left behind on the surface to betray the sloppy work standard. Take one look at a C program written by a programmer in a sloppy mindset, and you know right away you are maintaining the droolings of an idiot. In APL, it could take you an hour to parse beneath the surface to find that same incompetence leaping out. The C language has far more expressive scope for the droolings of idiots and I guess those markers are worth a lot at the end of the day.
I believe the Mac character problem is an issue with the fonts being old-style MacRoman versus Unicode.
Slashdot always allowed HTML entities in the old days but disabled them for a while because trolls were making Unicode art with them.
. . . that it's much faster to write out an equation in LaTeX than using any WYSIWYG editor on earth.
If there aren't enough buttons on the keyboard for all the symbols you need, it's much easier to represent the new ones as an obvious and transparent combination of the ones you already have than it is to construct complicated schemes to generate new ones.
Until we experience a revolution in computer interface design, it's going to take at least two keys to create a not-equal-to sign. If people are going to be forced to remember a two key sequence, it's overwhelmingly easier if what they see on the screen is a literal representation of that two key sequence.
Going to special characters replaces a single translation step (representation -> function) with two translation steps (encoding -> representation -> function). If you ask me, memorizing twice as many things so that you can avoid seeing ugly NEs and \ns is just crazy.
What's more, even if you allow for special characters, there are still plenty of non-obvious representations to deal with. What's the math symbol for a conditional numerical equality as distinct from a definition, a string comparison, or a memory address comparison? Sure, you could standardize to a definition: triple equal signs, equal signs with little symbols above and below them, etc. But what's the point? If you have to learn the definition anyway, then it hasn't bought you anything over the existing schemes.
Finally, if I may be permitted to pass from serious critique into parody verging on a flame, why not go one step further:
We needn't stop with special characters. Why are we using clunky language-based schemes to represent all the other aspects of programming languages? Why haven't abstract symbolic representations for function and variable names caught on?
After all, "case variable of" isn't exactly transparent, and it isn't closely coupled to the English definitions of those words. Just think how much cooler it would look translated as "schematic symbol for a multi-throw switch, abstract geometric shape, cartoon human figure with arms held out in a 'who, me?' expression." That would make for some beautiful code!
OS X definitely has the symbols as Unicode fonts, and they work in Safari. If Mozilla is still a Carbon application rather than Mach-O, that really needs fixing... I remember seeing an "experimental" Mach-O build back in the days of Mozilla 0.x, I'd have expected the real thing to be up to date by now.
That's why there's a "Preview" button, loser.
The simple answer is: nothing is really gained by this. Why is English the language of the world? Because it is rather simple. Chinese won't replace English for the same reason. I would never ever consider writing a program with non-ASCII characters, besides perhaps accented characters in strings. The reason is also very simple: have you ever tried to port a program to a different platform? I have done it many times. Only the ASCII part survived. Sure, you may create a better standard than ASCII, but if you use it for programming, it will not improve your life. Abstraction is the core of every program. Nothing is gained by replacing != with something more readable if you have to define odd-looking names all the time. In fact, it only makes it harder to read if you happen to run into technical difficulties with the new character set.
Actually, that's the symbol for a graphic representing a newline (a slightly raised N next to a slightly lowered L, shrunk and crammed together into an area approximately a single em-space wide), so maybe that's not such a good idea (as how would you represent the graphic itself in a string?).
OTOH, a \ followed by U+2424 could better represent a newline graphically in a string.
The reason that \n seems "pretty straightforward" is that most of us are used to it.
The concept of backslash followed by a letter representing a control character started in C in the early 1970s (or possibly even in earlier languages), and has been copied into dozens of other languages, along with other things like using % in printf strings to format variables (although some languages, like Ruby, are starting to offer alternative representations to %).
Note that, in Common LISP, a newline is represented by ~% and ~& in formatting strings, and #\Newline (spelled just that way) represents a newline character outside of formatting strings.
In Object Pascal/Delphi, a newline is represented by its decimal or hexadecimal equivalent, #10 or #$0A.
Some languages, like Python and sh/ksh/bash/etc., allow an actual newline in a string itself, so no representation is necessary (although Python allows \n as well, in its non-raw strings).
Other representations that I have seen in the past include ^J and ^M^J (for line feed and carriage return/line feed as control characters) and $ (for end-of-line in regular expressions, although the $ usually doesn't match the actual newline itself, and in "list" mode in vi).
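Several of these spellings can be compared directly in Python, which accepts an escape, a numeric code, and a literal line break inside a triple-quoted string. A small sketch, with arbitrary variable names:

```python
# Several spellings of the same one-character string (line feed, code point 10).
escaped = "\n"
by_code = chr(10)
by_hex = "\x0a"
literal = """
"""  # an actual line break inside a triple-quoted string

# A raw string keeps the backslash and the letter as two separate characters.
raw = r"\n"

print(escaped == by_code == by_hex == literal)  # True
print(len(raw))                                 # 2
```

All four are the same value at run time; the only difference is how they look in the source.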
Those who sacrifice security to condemn liberty deserve to repeat history or something. - Benjamin Santayana
So are you seriously asserting that \Unicode 2424 should be used in place of \n? Sure, it's pretty and all, but A) it takes a hell of a lot longer to type/specify using a keyboard, and B) common functions should be mapped to common characters. Newline is EXCEEDINGLY common, so it should be very, very fast to specify, not mapped to some obscure graphic buried somewhere in Unicode. (at least 2424 would be pretty easy to remember.)
Your observations of the alternate newline syntaxes were interesting, but I submit that they're probably all inferior to the old \n standby. They all require quite a bit more typing... and newline should be really, really quick to embed.
That aside, the whole thing of 'escaping' characters is a bit silly on some levels... because you run into all this weirdness when you're writing programs that make scripts (that possibly themselves make yet MORE scripts)... debugging how many escapes you need, and where, can take awhile. And escaping an escape character does something else AGAIN... this really does get pretty twisted. The whole IDEA of escaping strings may be a bit broken, a holdover from when we had only 64 or so characters to work with. Data being interpreted as a form of code is a bit dangerous, as we saw with all the formatting string bugs a year or two ago. But, like most dangerous things, it's also powerful.
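The "scripts that make scripts" problem is easy to see by counting. The sketch below assumes a deliberately simplified quoting rule (only backslashes and double quotes are escaped); the function names are made up for this example, and real shells and languages each have their own rules.

```python
def escape(text: str) -> str:
    """Escape backslashes and double quotes so text can sit inside a "..." literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def backslashes_needed(levels: int) -> int:
    """Count the backslashes a single backslash becomes after `levels` rounds of escaping."""
    text = "\\"
    for _ in range(levels):
        text = escape(text)
    return text.count("\\")


for levels in range(4):
    print(levels, backslashes_needed(levels))
# prints:
# 0 1
# 1 2
# 2 4
# 3 8
```

Each layer of generated code doubles the count, which is why debugging nested escapes takes so long.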
I wouldn't mind if we were to do it some other way entirely. Unicode might work, but we have only so many keys on the keyboard, and proper Unicode handling is complex. It would require modifications of millions of programs that don't yet support it. Maybe Delphi's idea of an interpreted code fragment, or possibly some internal constants in languages, would be a better idea than using codes to embed escape characters.
I don't see things changing anytime soon, though. Like it or not, we're going to be stuck with \n for a LONG time. And, really... isn't it nicer than \U+2424?
Also, I'm not suggesting that someone should have to type backslash alt u 2 4 2 4 or something similar to get a newline.
Here are two ways to do it:
- Type \ <alt>+<return> (three keystrokes), and the keyboard handler sends \ U+2424 (two characters) to the editor, which places them in the text, or, even better,
- just type <return> (one keystroke), and the context-sensitive editor, seeing that the cursor is inside of a string, inserts \ U+2424 (two characters).
Note that many editors these days, such as vim, can know whether or not the cursor is inside of a string, and so it should be possible to have different keyboard mappings for the return key, depending on where the cursor is. That way, when you type <return>, the editor can insert \ U+2424 if the cursor is inside of a string, or <newline> if it is outside of a string.
Even better, the editor can insert \ n into the string so that it's valid ANSI C/C++, but display it as U+2424 in a different color to make it stand out as a newline character.
That way you get the best of both worlds.
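The display-only variant could look roughly like this. It is a sketch of the idea rather than of any real editor feature: `display_form` is a hypothetical name, the input is assumed to be the text between the quotes of a C string literal, and colouring is left out.

```python
import re
import unicodedata

NEWLINE_SYMBOL = "\u2424"  # named SYMBOL FOR NEWLINE in Unicode


def display_form(source: str) -> str:
    """Show each \\n escape as U+2424, leaving every other escape untouched."""
    def swap(match: re.Match) -> str:
        pair = match.group(0)
        return NEWLINE_SYMBOL if pair == "\\n" else pair

    # Consume escapes pairwise so an escaped backslash followed by 'n' is not misread.
    return re.sub(r"\\.", swap, source)


shown = display_form("first\\nsecond")
print(unicodedata.name(NEWLINE_SYMBOL), len(shown))
```

The text on disk still contains the two characters \ and n, so the compiler sees ordinary C; only the rendering changes.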
That's actually not a bad idea. Wonder if someone will pick it up and run with it?
In fairness to the OP, if, like many programmers these days, you've never programmed or seen anything *but* C/C++ and Java, you might think the syntax is very different.
Eh? I'm not an old fogey, so maybe C implementations back then sucked, but -32768 does indeed mean the least possible 16 bit signed value. And C doesn't have a (binary) negation operator. "-32768" is a constant, it's not parsed or implemented as "0 - 32768".
The constant ambiguity between subtracting a positive constant and adding a negative constant does *not* help one in recalling to mind the derivation of the formula involved (with the original symmetries intact).
I don't know what number system they use in APL, but in the one the rest of the universe uses, adding a negative and subtracting a positive are mathematically identical.
I must however reserve the innermost circle of hell, following along in P.J. Plauger's footsteps, for those who implement malloc(0) to cause an "instructional" abend.
If you could provide a situation where malloc(0) wasn't a logic error, then perhaps the people implementing it might be more interested in making it work.
Well, ":imap ^Q^M \n" works in vim to insert (the two-character sequence) "\n" when <return> is hit, but I can't figure out how to detect when the cursor is within a string.
It would be nice if autocommand had a mode that would fire when entering or leaving a syntactic region, so that I could map and unmap the key that way.
Oh, well; I will play with it some more when I have time.
If I can figure it out, I will also set it up so that "^I" (tab) inserts "\t", etc.
I am using expanded character sets. I've been using "≠" and friends in AppleScript for years. In Scheme, I use "λ" instead of typing "lambda". I use the native2ascii program that comes with the JDK to use Kanji or Esperanto characters in identifiers. I wrote a similar preprocessor to expand "≠" & friends in C source.
I can't tell you why it hasn't caught on, but there's nothing stopping anyone from doing it today.
(Although, it seems Slashdot doesn't like those characters...)
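A minimal sketch of what such a preprocessor might look like, in Python rather than whatever the poster actually used. The symbol table is an assumption (the post does not list its mappings), and the string handling is deliberately simple: double-quoted literals pass through untouched, while comments and character literals are not treated specially.

```python
import re

# Hypothetical mapping; the original post does not say which symbols it expands.
OPERATORS = {"≠": "!=", "≤": "<=", "≥": ">="}

# Matches a double-quoted C string literal, including backslash escapes.
STRING_LITERAL = re.compile(r'"(?:\\.|[^"\\])*"')


def _expand(code: str) -> str:
    """Replace each special symbol with its ASCII operator."""
    for symbol, ascii_op in OPERATORS.items():
        code = code.replace(symbol, ascii_op)
    return code


def expand_operators(source: str) -> str:
    """Expand symbols everywhere except inside double-quoted string literals."""
    out = []
    pos = 0
    for match in STRING_LITERAL.finditer(source):
        out.append(_expand(source[pos:match.start()]))
        out.append(match.group(0))  # leave the literal exactly as written
        pos = match.end()
    out.append(_expand(source[pos:]))
    return "".join(out)
```

Run over a line such as `if (a ≠ b) puts("a ≠ b");`, this would rewrite the operator but leave the text inside the quotes alone, so the result compiles as ordinary C.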
That demonstrates a lack of vision.
MAKE A NEW KEYBOARD.
Not that hard to do. Almost all computers have function keys on top. The majority of users DON'T USE THEM.
Just print up some new keyboards that have single symbols representing the major programmer stuff, such as >=. To use them, print them above the F1, F2, F3, etc. keys and access them by typing Shift+F1, etc. Allow them to be overridden by programs that want to override them.
If Apple did this, it would catch on instantly. In one year, Microsoft would steal the idea.
excitingthingstodo.blogspot.com
If you think Java's syntax is radically different from C's syntax
Syntax is one of the least important features of a language, and OP never said anything about them having different syntax.
OP wrote:
"Try C, LISP, SmallTalk or Ruby and you start to feel that the language is helping you. Cobol & Java fell like they are getting in the way."
One can infer that he's in favor of strongly, dynamically typed languages with the exception of C (weakly, statically typed), and against strongly, statically typed languages (or at least those without type inference).
I can venture a guess with reasonable confidence that he'd like python and scheme and would strongly dislike Pascal and Ada.
For languages that fall outside of the spectrum he described (say, Tcl and VBscript, or Dylan and ML) I have no feel for what he'd like.
Those guesses have nothing to do with syntax.
rage, rage against the dying of the light
I ain't your son, got that pal?
Compare COBOL to LISP: which one came first? Which one can be applied to solving the most problems? Which one provides the most abstractions to the programmer? COBOL was in no way a step forward from LISP.
Below is some history of the languages, pretty accurate IMHO.
http://en.wikipedia.org/wiki/COBOL
http://en.wikipedia.org/wiki/LISP
http://en.wikipedia.org/wiki/Java_programming_lan
You'll see that COBOL/Java was designed by committee before being in general use. Most successful languages eventually get a committee, but hopefully the language in use is strong enough to stop major changes being made by the committee.
I've coded commercially in all the languages listed except 701, I have used Z80/6502/68000 though.
As for C/Java, the syntax is similar; Java was designed not to scare C/C++ programmers. But I never mentioned syntax. That may be how you judge a language, but to me it's a trivial matter; what's important is what the language lets you do, e.g. does the ANSI COBOL standard have recursion yet?
Loser or not, he's got a point, asshat.
everybody knows the real losers are teh ones who use the preview button. although it's quite sad this fuckup sat there for an hour just to submit a comment explaining his mistake. then again, maybe I'm the one true fucking idiot who doesn't know the way to post over 1 comment an hour.
Squeak, an actively-developed version of Smalltalk that traces its roots back to Xerox PARC, does use a few. When you type an underscore, it renders it as a character that looks like "<--" (but it's a single char).
(It's used as the assignment operator. Other Smalltalk variants typically use :=, a 2 character combination which was chosen to "look like" the original left-pointing arrow that was used at Xerox. Squeak allows the := as well, but it doesn't look nearly as cool as the left arrow.)
Similarly, ^ renders as an up-arrow, not just a caret. It hearkens back to Xerox custom displays with custom character sets. Yes, it only works in the Squeak environment where it renders its own characters to a bitmapped (bit-blitted?) display, but it is the only surviving case of the custom characters that I know of (at least, that I use?)
Mike
No, you're not my son, you're just another young moron who thinks links reflect knowledge. Of course, if you read your links you'll see that COBOL was driven by FLOW-MATIC; Java wasn't designed by a committee, but the version of C you've most certainly used was; and that LISP, FORTRAN, and COBOL are in fact exactly contemporary.
If you had much deep knowledge of programming languages --- or had read the links you posted --- you'd also realize that Java has more in common with Smalltalk than pretty much any other conventional language in any way except syntax.
What you probably don't know is that I stopped sleeping with your mother when I realized she'd have children that were ugly and dress funny.
Ahh,
my mistake, I took your original post as having some slight oversights that I could gently point in the direction of understanding. My mistake, won't happen again with you. Apologies for wasting your time.
For crying out loud we coders have contributed a lot to computing (!) At least give us an assignment operator and a new equality operator on the keyboard. It's not as if we aren't sure we're going to need it in a few years!
I18N == Intergalacticization
That is, '←' (left arrow)
Dang slashdot won't display the ≠ properly!
Then the text editors could gradually migrate to converting != on input to ≠ in the text and displaying it. For instance, Notepad on Windows will display Unicode properly. Most other editors will too, like Eclipse...
By which I mean not only that the compilers would be able to compile UTF-8 files and so forth but recognize the unicode symbols as valid operators.
Actually, I think it's worse than that. Perl 6 is adding Unicode support in Perl source, so you'd think they'd be able to add support for all sorts of new operators. But from what I can tell from the mailing lists, they are just overloading the hell out of the question mark.
@list = ? 1 2 3 4 5 ?;
@list ? grep {$_ % 2}
? map {$_ ? 2}
? @newlist; # 1, 9, 25
@list3 = -? @newlist; # -1, -9, -25
@list4 = 2 *? @list3; # 1, 81, 625
@list5 = @list3 ? @list4; # -1, 1, -9, 81, -25, 625
%hash1?a? = "alpha"; # { "a" => "alpha" }
I think it's going to get a lot worse before it gets any better.
you want a special key for programmers ONLY?
good
try typing © in Turbo C++ v3 (The old IDE which EVERYONE in the c++ world has used, seen, or should do so, or at least the RHIDE clone)
You CAN'T. That's because old legacy code such as that of Win Xp's "NT Virtual DOS Machine" (ntvdm.exe), TC++, MSDOS and others, don't support the extended character set for direct input AFAICT...
Also... keyboards are designed for the "common" man.. you think Jackie Thommpson emailing his mommy about how he tricked the San Andreas police into arresting a cartoonist in a dream, will EVER use the proposed '!=' key? In his life?
Actually, I'd say Java has a lot more in common with Ada than it does with Smalltalk. Especially the way packages and threads work.
Software sucks. Open Source sucks less.
Well, of course Smalltalk doesn't really have either one, or even separate compilation, so there are similarities to Ada, C, C++, Eiffel, and so on, that distinguish Java from Smalltalk. On the other hand, Java has the bytecode interpreter basis, closures of a sort, garbage collection, and a large standard class library with lots of GUI and network richness.
Perhaps more to the point, Jim Gosling has been quoted as saying that Java was based on trying to bring Smalltalk to C++ programmers.
Interesting. I guess it matters which part of the language you look at. Thanks for the reply. Good points, except for the closures bit -- inner classes (which didn't come about until around Java 1.1) don't really compare to closures.
Seems like if he really wanted to bring Smalltalk to C++ folks, he really should have found a way to popularize Objective C though. Especially since Sun was a part of OpenSTEP, which included what's probably the best GUI framework around to this day. It sure beats AWT and Swing. I suppose the mixed syntax of Objective C is a problem for folks not familiar with Smalltalk.
I still think Java has the Ada "feel" to it. It feels "big", with its large library, and has a syntax that's overly verbose in some places. I can't think of anything in Java that reminds me of Eiffel though. (Could be because I never used Eiffel; I only read the book.)
Anonymous inner classes are closer, but I think I said "closures kinda sorta" or something to that effect anyway.
The problem with Objective-C --- well, hardly the only problem, let's say a problem --- is that it loses the underlying virtual machine and thus shares C's problem that a large part of the behavior of a program can't be predicted until you know what machine it's going to execute on. It's also relatively hard to learn, since it has two syntaxes, C and Smalltalk-like.