The State of Natural Language Programming
gManZboy writes "Brad Myers (and co) of the Human-Computer Interaction Institute at Carnegie Mellon have written an interesting paper about the state of natural language programming. They point out that well-understood HCI principles aren't finding their way into relatively new languages like Java and C#."
If
Natural Language is not making its way into Programming
Then
Programming should make its way into Natural Language
Else
Continue
The article seems like some kind of summary. Unless I missed something important, like, a second page or something. But basically, it seems to suggest that, even after all these years, we still aren't any closer to having a natural way to program. Huh.
Inevitably you end up with an artificially rigid language structure that sounds like something that nobody would EVER say. Perfectly easy to read; after all, who wouldn't understand what "ADD VAR1 TO VAR2 GIVING VARX" means? But who the hell would use the word "giving" in such a way? It's a nightmare to learn or write, at least for English-speaking people, who would have to constantly fight years of learning to speak real English to make up for the fake English in the language.
If I have been able to see further than others, it is because I bought a pair of binoculars.
Well, duh! That's because if, according to the article...
> The goal is to make it possible for people to express their ideas in the same way they think about them.
#include // Do What I Mean
thingy main (thingy list) { Sort thingy // wave hands
No, like this
With the guy's name on the right
No, I guess the middle initial deserves its own column. No, I didn't think of that.
But don't print the middle initial.
No, not like that.
Eew, that font sucks.
Yeah, like that.
No, like it was before.
Yeah, no--wait. I gotta talk to my boss.
He said to do it like this.
But he didn't like it.
Fuck this, I'll pay some guy in India to do it.
}
Given the state of natural language on /. this isn't going to work :-)
John.
Is Macromedia's ColdFusion syntax. As it continues to become less tied to HTML it will be interesting to see where this goes.
But natural language requires more typing than say C syntax.
A EQUALS B
A = B
But does the thought process get sped up? If so, one needs to know how the gains and losses affect overall development.
I disagree with the article's assumption that interesting programming errors are due to people being unable to express themselves "naturally" in code. Rather, I find that almost all errors worthy of debugging come from not understanding the problem domain correctly.
jeff
HCL (Hilbert Class Library) has little if anything to do with HCi (Human Computer Interaction) or HCl (hydrochloric acid). The article is about HCi.
On that site, there's http://www.alice.org/whatIsAlice.htm which says
So, this is just like Visual Basic. I know that can't be true, or else Microsoft would be marketing VB as NLP. So what am I missing?
Cobol, anyone?
...
Multiply x by y to get something or the other
An interesting read.
Write a Natural Language Compiler and you'll find that programmers can't write in a Natural Language. Can you imagine what would happen when you have to understand, not the flow of the code, not the overall process of the application(s), but HOW the writer was THINKING when they wrote the code? I've worked on a couple interesting projects where the programmers originally were involved in the physical business process, and eventually ended up coding (don't ask). When I had to edit their code, there was NO way of understanding it unless you actually talked to them and realized how they were thinking about the problem. It's not that the code was so poor, but they wrote code based on how they'd seen the business operate, and that just didn't translate nicely into straightforward code.
:)
Personally, I don't see how creating a language that encourages this behaviour can be a good thing. Isn't this the point of learned programmers? The ability to translate real world situations into easy to understand processes? Then again, I'm no language development guru.
One of the big problems this approach will have to overcome (in my opinion) is that people generally tend to order their thoughts in a manner specific to their native language. A development environment that seems intuitive and easy to use to a native English speaker might be backwards or obtuse to a person who natively speaks another language. To clarify: I'm not speaking strictly of the grammatical structure of language, but of a seemingly inherent difference in the way people learn things based on what language is used in the teaching. For this reason it has always seemed better to me for programmers to learn a new, common language (that of the higher-level compiler they are interested in) so that when they work with others, everyone is on the same page (similar to scientists and doctors using Latin nomenclature).
I'd imagine that a "natural language" system could be developed with different approaches based on the native tongue of the programmer, but I would think this would damage the benefits of commonality that other languages now enjoy.
That's about as far as I got. I guess he didn't really express his ideas in the same way that I wanted to think about them.
Which nicely illustrates the point that there's always a "semantic gap" associated with natural languages, which builds up because people have different ways of thinking. The semantic gap is even wider when one of the entities being communicated to happens to be a machine. There's a reason why traditional programming languages are precise and exact...it's so that the gap is reduced - the machine will do exactly what you tell it to do...even then we have a disconnect between what the programmer's thinking, and the code that he's writing.
Natural language isn't precise enough for serious programming. I personally wouldn't enjoy typing so much for no added benefit. It seems like this sort of thing only has value amongst people who are learning to program. Why would a mainstream language like Java or C# cater to this bunch?
"The goal is to make it possible for people to express their ideas in the same way they think about them." There's your problem right there :) I think they're probably not being adopted because in the world of programming convention is the key to interoperability. Human thought and language aren't so strictly tied to convention.
Luck favors the prepared, darling.
It seems to me that the steps in the Natural Programming approach are not at all novel and certainly not as useful as they appear. The authors seem to have forgotten the train wreck that was AppleScript. The authors state that the syntax of programming languages is too complex. I would argue that the syntax of a programming language needs to be more complex than the syntax of a natural language. The sad fact is that English (and other natural languages) were not designed with enough precision for things like programming languages. For example: if in some natural programming language someone were to state "if x or y do z", does this statement mean that x and y need different values, or can they both be true? One can't tell from looking at the statement.
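To make the "if x or y do z" ambiguity concrete, here is a minimal Python sketch of the two readings. The function names are made up for illustration; nothing here comes from the paper.

```python
def should_do_z_inclusive(x: bool, y: bool) -> bool:
    """Reading 1: do z when x holds, y holds, or both hold (logical OR)."""
    return x or y


def should_do_z_exclusive(x: bool, y: bool) -> bool:
    """Reading 2: do z only when exactly one of x and y holds (XOR)."""
    return x != y


# Print the full truth table so the single disagreement is visible.
for x in (False, True):
    for y in (False, True):
        print(x, y, should_do_z_inclusive(x, y), should_do_z_exclusive(x, y))
```

The two readings only disagree when both x and y are true, which is exactly the case the English sentence leaves open.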
One thing that programming languages force upon you (the programmer) is the ability to get what you want using the least possible resources.
Natural language, while easier for beginners, would make for horribly inefficient code and would be undesirable for any sizeable application.
"Ask not what your country can do for you." --John F. Kennedy
IMO it's nothing more than a better way to introduce *newbies* into programming.
Why would any programmer want to code in English? To me this:
myvar++
makes more sense than:
increase the variable myvar by one please
Do we really want people who can't understand something as simple as "myvar++" to be programming in the first place? Seems to me we NEED a barrier to entry. There're enough lousy programmers out there already.
It isn't that there aren't any languages that follow these principles coming out; lots of them are. It's just that the only languages that have become popular ignore these principles.
The fact is that people don't care what's academically sound, or what people have "proven" is the best way to do things. In fact, the things people do care about are directly contradictory with what's academically "best". It isn't some kind of head-slapping coincidence that the new popular languages ignore "natural programming". It's the market speaking, and it's saying "we don't want natural programming languages".
How about an answer from someone more well-acquainted with basic human desires?
-mkb
YOU FORTH LOVE IF HONK THEN
And here's some filler text to compensate for /.'s sucktacular lameness filter. Blah blah blah. "It won't be any more frightening than the time I climbed up an elevator shaft with my teeth," said Sunny.
You cannot apply a technological solution to a sociological problem. (Edwards' Law)
Well, I'm not sure if it's that nobody read the article, or if nobody actually understood it, but.
:-) (And no, I'm not using Englishy COBOL syntax.)
We've had a lot of posts about "OH NO! COBOL!" Yes, yes, I agree with you -- pretending to be English usually results in awkward and unnatural syntaxes. One of the advantages of a formal syntax like most programming languages is that it clicks the brain into a different mode. (How many of you can read sigs like 2b||~2b? I thought so.)
But that's not really the paper's main aim. It makes a couple of notes that all of us, particularly those of us in language design, could benefit from.
1. People tend to deal with collections in the aggregate far more often than they step through them an item at a time. The example given was "set the nectar of all the flowers to 0." Look past the syntax for a moment and look at how simple that is.
2. Debugging the traditional way sucks. Did anyone actually read that bit at the end about the 'Why?' questions, and look at the screenshots? Holy crap. That's really impressive.
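Point 1 can be sketched in Python. This is only an illustration of the aggregate idea, not the syntax the paper proposes; the Flower class and the set_all helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Flower:
    name: str
    nectar: int


def set_nectar_stepwise(flowers, value):
    """Item-at-a-time style: the caller manages the index and the bounds."""
    i = 0
    while i < len(flowers):
        flowers[i].nectar = value
        i += 1


def set_all(items, attribute, value):
    """Aggregate style: one call that reads like 'set the X of all the Ys to V'."""
    for item in items:
        setattr(item, attribute, value)


garden = [Flower("rose", 5), Flower("tulip", 3), Flower("daisy", 8)]
set_all(garden, "nectar", 0)  # "set the nectar of all the flowers to 0"
print([flower.nectar for flower in garden])
```

The loop still exists inside set_all, but the person writing the request no longer has to spell out the bookkeeping.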
Of course, I may be biased, because the points made in the article are basically the same that underlie a language I'm currently designing.
The real problem is a lack of strong domain models for most real world situations. That is, if you're starting a project to emulate something happening outside of a computer, then there's a very large likelihood that you're going to have to build your own object model to describe the situation to the desired level of accuracy. Once you have that model, it's easy enough to say "do this until that happens", but there's a world of difference between that point and staring at a blank screen at the beginning of a project.
There's been some progress (depending on who you ask) to make this easier for those who aren't full-time programmers, such as UML and related design tools, but even these are mainly limited to building a high-level template of the final result so that a human can manually implement all of the details.
This may or may not be avoidable. Vernor Vinge (author and CompSci professor) refers to the "Age of Failed Dreams" where humans eventually concede that some things just aren't possible. Expecting a current deterministic Turing device to be programmable at the level where people interact with each other may very likely be one of those areas.
Dewey, what part of this looks like authorities should be involved?
Right now that happens - only the program gets generated by programmers (sometimes outsourced to India!)
Unfortunately, what the user says they want, and what they really want are usually very different things. Natural Language Programming really doesn't solve that problem.
The critical piece is the Designer, who sits between the end user and the programmer, and asks the tough questions: "Do you really want that? Let me explain the implications of what you just asked for." "How critical is that piece of functionality that you just added on a whim, but it just added 3 years to the project plan?" "You're asking for the data to be selected this way, but really there's no use for that - have you considered selecting the data this other way?" etc.
It would be nice to send out the specs for the program, run them through the parser and get the program you want, but the truth is that normal human language wasn't designed for problem solving, especially in some of the details that programming requires. Things like nested lists, e.g. (1,2,3,(2,4,3,2),5,2,(2,3,5,6)), which are easy to learn and to write in code, are much harder with natural language.
Make a list with the values 1, 2, 3, then this is a list of 2,4,3,2, now we are back in the first list with some more values of 5 then 2, now we get another list inside this list as 2,3,5,6. Now we finish both lists.
As you see, in English this is clumsy. I am sure someone with a better mastery of English may be able to make it a little more precise, but still, just giving up and using the () makes it a lot easier to see and understand than using a bunch of words.
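For comparison, here is the same nested list written in Python, with two small helpers to show that the structure survives intact. The helper names flatten and depth are just illustrative.

```python
# The nested list from the comment above, written with brackets.
nested = [1, 2, 3, [2, 4, 3, 2], 5, 2, [2, 3, 5, 6]]


def flatten(items):
    """Return every value in order, with the inner lists unpacked."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


def depth(items):
    """Return how many levels of list nesting there are (a flat list is 1)."""
    inner = (depth(item) for item in items if isinstance(item, list))
    return 1 + max(inner, default=0)


print(len(nested))      # 7 top-level entries, two of which are lists
print(depth(nested))    # 2
print(flatten(nested))
```

The brackets carry all of the "now we are back in the first list" bookkeeping that the English version has to narrate.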
Most human languages were made thousands of years ago, and came from languages tens of thousands of years old, if not millions of years old. They were not designed for micro-processing of information. They were required for more common-sense reasoning, which we as humans often fail at, so imagine how poor a computer would be at common sense.
If something is so important that you feel the need to post it on the internet... It probably isn't that important.
It is a matter of habit and training. I am used to thinking in terms of objects, so any object-oriented language is "natural language" for me. When I solve a problem I think of objects, methods, properties and how they work together. I don't have to translate from some abstract "natural" concepts to OO concepts. I am sure someone who is using Lisp will see lists and functions in the same problem where I see objects and methods.
I understand that the goal is to have the user just tell the computer what to do in English. The problem is that English is not precise and is too ambiguous. I don't know if I would want to fly on an airplane if I knew the computer on board was programmed in English.
Why does all research like this seem to revolve around "toy" problems? They study non-programmers or, when they include real programmers, focus only on small tasks that can be completed in an hour or so.
Great, I accept that a new language can make toy problems easier.
However, I think the situation is very different when you have a real programmer working on a real program. Writing a real application, like a word processor or a web browser, is difficult no matter what language you do it in -- and I would argue that the difficulty doesn't vary much between languages. In fact, I would further argue that many of these research languages, while making toy problems easier, would actually make "real" programming substantially harder, because the semantics of the language are not as formalized and thus more difficult to remember and deal with.
I'm certainly not opposed to advances in language theory and design -- our modern-day large applications would be essentially impossible to write if all we had to work with was machine language. But to be a major advance, a new language should focus on making real problems easier for real programmers, not making toy problems easier for non-programmers.
Granted, it was by no means a fast runner, but you could write more or less plain English to it:
Who could possibly be confused by this code?
Notice the brilliant little keyword called "it", that you could use with "put" and "get". Neat, simple, easy!
eulogy
"Good news, everyone!"
There are two main features of AppleScript: 1) the English-like syntax, and 2) the ability to control other applications. Of the two, the second is by far the most important, but together they create a new programming experience. Because most of the complexity is sequestered in the applications you are controlling, AppleScript code tends to be very short. On average I would say my AppleScripts are about 30 lines, with only 5-10 lines of working code; the rest is error catching and handling. Because of the syntax it is very easy to read the code months or years later. Having short code also helps. BUT the most important thing is that because the code is short YOU CAN START AGAIN. How much code is kludged because no one wants to rip out and recode 1000 or more lines of code? There is a real benefit in short code that you can read.
Programmers sit serenely, silently coding for hours at a stretch.
Then they execute the code for the first time, see the results, and scream out SHEEEIIIIT, GODDAMN IT!!!
Hence, to an outside observer, the natural language of programmers is indistinguishable from a case of Tourette's Syndrome.
About the word "if": If bullfrogs had wings, they wouldn't bounce around on their little green butts.
The solution of course was to tell AppleScript that regardless of what happens, just issue the open application command and stop caring. I spent a good hour or so digging through documentation until I finally found how to do this, and the answer is so glaringly obvious that it makes one feel stupid when they realize they should have known it all along:
That's it.
T Money
World Domination with a plastic spoon since 1984
Down that path lies madness. Or Adam and the Ants.
Fourth Generation languages expressed things in terms of the requirements, rather than the actual mechanisms involved. Such languages have been around for a while, but have largely failed in the "real world". Pushing the complexity into the compiler turns out not to reduce the errors and can actually severely impact both performance and stability.
Fifth Generation languages tried to solve the problems generated in Fourth Generation languages by assuming that the compiler wasn't complex enough. By adding a degree of "intelligence" to the compiler, it was believed that you could increase the abstraction further (thus allowing the compiler greater freedom to choose an optimal solution) and use learning to correct mistakes in methods. Fifth Generation languages were first introduced in the mid 1980s, but nobody could figure out how to get them to work as designed.
Fifth Generation languages forked, at some point, and the only really active area of research is in Genetic Algorithms. GAs are intriguing, but not practical outside of heuristics.
Today, the only languages used in practice are Second Generation (assembly) and Third Generation (the C family). Object Oriented programming has allowed Third Generation languages to encompass problems that would otherwise be impractical below a Fourth Generation language. This has proved a far more fruitful line of research and development. Far more "real" programs exist in Java than exist in APL or Prolog, for example.
(There are plenty of Operating Systems written in C and C++. Could you even begin to imagine writing one in APL??)
Parallel Programming has gone a similar path. Complex compilers, such as "Parallel C" and "Parlog", and Parallel Languages, such as "Occam" and "SISAL", were intended to replace the complexity of hand-coding parallel functions in conventional serial languages.
Such methods died a death, as threading and event-based programming proved perfectly adequate for most purposes. Of the above, Occam is the only one I ever regarded as being particularly effective, but it's rare to see anyone still use it.
The "advanced compiler" approach being tried these days is OpenMP, but when you look at the relative amount of time and effort invested in solutions, far more people are working on code using PVM or MPI, maybe throwing a clustering system such as Beowulf or Mosix on top, than they are using OpenMP.
The experience of time appears to show that Third Generation languages are the optimal balance between compiler/language complexity and code complexity. As such, I expect it to be more productive to improve whole-code optimization and auditing tools for such languages. These don't require the compiler to be any more complex, as they can parse the input and/or output, in traditional Unix daisy-chaining style.
a language that leaves all the verbs for the end of the sentence? A language that likes the modifiers to follow rather than precede their nouns? ...My point is, you have one translation problem in going from high level language to machine language and another going from "natural" language to high level language. But a third problem is finding a culture-neutral natural language OR solving the natural language translation problem... and you have seen how atrocious babelfish results can be... we just aren't there yet, folks! The ambiguity that must be dodged in going from normal human speech to a computer program hides in different places depending on the language, especially on which words have multiple meanings. And inflection? What are you going to do? Program with emoticons?
I know that natural language is creeping into UIs in specialized search engines. If you know where to look, you will find natural language search features on Fidelity.com and perhaps other financial websites. These are much more carefully bounded problems than the broad challenge of allowing a user to express, in ordinary speech, a solution or algorithm for an arbitrary problem that a computer could be programmed to do in, say, C. The article cited is interesting and it might make life better for us programmers, but I am not getting my hopes up that more than incremental change to computer languages is around the corner.
Actually you're much closer to the truth than you think.
People here on /. automatically assume that a natural language is a change to code syntax making it closer to English while keeping the semantics identical to C or Java. But it's the other way around, folks.
A natural language as stated in the article would be closer to the programmer's mental model (and no, the mind of a programmer is still not equal to a von Neumann machine). At the very least, such a programming environment would provide incremental changes and retractions to previous decisions, such as those listed in the parent's joke.
Programming is also something that is easier to express in a specialised language. Sure we can make some things more human readable but does that make it easier to understand? The hard part of programming isn't reading/writing the code so much as knowing what structures and concepts to use. Making programming more natural-language-like will not really make programming easier; you still need skill and practice. Using the music analogy again: I don't play music and can't read a music score (the language of music). If Beethoven's fifth (if he ever had a fifth) was rewritten in a natural language it would not make it easier for me to play; I'd still need a whole lot of practice with a piano or whatever to play effectively. Relative to acquiring the piano skills, I expect learning to read sheet music would be relatively simple.
Where natural languages might help is in system design and requirement capture. Still, however, I think that most often things go wrong because when people are expressing their thoughts in a natural language they use very woolly thinking and use vague terms.
Engineering is the art of compromise.
Sadly true. Will somebody actually RTFA and realize that 99% of the other posts are offtopic? The article isn't describing a new way to program the computer using natural language; we already have COBOL for that, and we all know how that approach works.
The article is talking about new programming paradigms, which deprecate the old von Neumann architecture programming model and allow for a more flexible mindset while programming. The latest generation of scripting languages (Python, Ruby...) would be a step in that direction. The article is proposing to explore that domain scientifically.
Singularity: a belief in the "God" idea with the "demiurge" relation inverted.
bp
Comparing natural language (which Noam Chomsky says is universal across human languages) with any programming language is quite nonsensical: programming languages tend toward being unambiguous, context-free and deterministic. It's like comparing an image with a verbal description of it: the image is finite and unambiguous, while the verbal description can only be arbitrary.
I understand that the point of this thread is to find a way to remove or lighten some of the "translation" between the "human idea" and the "human computerized/programmed solution". For me, as the years go by, C/C++ has become another language built into me. I can convert problems into ones solvable via computing, quite on the fly (still planning and designing the solution, but the implementation itself comes in a natural way, like water flowing down a river).
Where's my compilable flowchart? They're more universally understandable across human languages/cultures, including geek/wonk/artist/customer/PHB, than text. They can be intermediate-compiled to text procedures for lexical parsing techniques. And they're much easier to design, program, debug, maintain and document, especially for parallel/distributed/networked applications. They're natural language without speech. Where's my gcc flowchart preprocessor?
--
make install -not war
... make a good programming language. Mathematics has "been there and done that" with natural language versus a formal language. Why reinvent the mistakes of the past?
God that reminds me of the old "guess the command" puzzles in old text-adventures.
..(much typing, and thesaurusing) ..
One of the wall panels sounds hollow.
> PUSH PANEL
You can't do that.
> TAP PANEL
You can't do that.
> PRESS PANEL
Nothing happens
.
> CARESS PANEL
The panel slides open.
I hope NL Programming wouldn't be bringing back that sort of thing..
It is very difficult to write a context-free programming language, let alone a natural one! When we speak, everything is meant relative to the current context. There is no way that a mathematical abstraction can be made out of that, unless really powerful computers can try every possible production at the same time (thinking about quantum computers).
We humans don't even talk logically at times (logically in the mathematical sense). We say one thing, we mean another one. One of the most difficult things for new students is to get used to the strictly mathematical nature of computer languages. Computer thinking requires every bit to have its special meaning in the universe. Most people choke on that. The most capable programmers are those that can hold a mental model of the application, its various parts and as a whole. These types of people can translate requirements to code very efficiently, because they can reason about a program's state better since they remember the whole program and they can immediately recognize the consequences of any programming decision.
And when one becomes familiar enough with the way the computer works, then the verbosity really gets in the way.
What we need is a development environment that can reason about the state of the program. That's the root of all problems. Embedding state information in a program is something I haven't seen in any language. Most languages, if not all, work on the assumption that anything can happen at any time, and they don't have state constraints, thus allowing the programmer to make mistakes that could be caught at compile time.
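As a rough sketch of what a "state constraint" might look like, here is a hypothetical Connection class in Python that declares which operation is legal in which state. Note the big caveat: this check happens at run time, so it is only an approximation of the compile-time checking asked for above. All names here are invented for illustration.

```python
from enum import Enum, auto


class State(Enum):
    CLOSED = auto()
    OPEN = auto()


class Connection:
    # operation -> (state required beforehand, state afterwards)
    _RULES = {
        "open": (State.CLOSED, State.OPEN),
        "send": (State.OPEN, State.OPEN),
        "close": (State.OPEN, State.CLOSED),
    }

    def __init__(self):
        self.state = State.CLOSED
        self.sent = []

    def _step(self, operation):
        """Reject the operation unless the object is in the required state."""
        required, following = self._RULES[operation]
        if self.state is not required:
            raise RuntimeError(f"cannot {operation} while {self.state.name}")
        self.state = following

    def open(self):
        self._step("open")

    def send(self, data):
        self._step("send")
        self.sent.append(data)

    def close(self):
        self._step("close")


conn = Connection()
try:
    conn.send("hello")  # illegal: nothing has been opened yet
except RuntimeError as error:
    print(error)
conn.open()
conn.send("hello")
conn.close()
```

The table of rules is the "embedded state information"; a tool that could read that table ahead of time is the part that would have to live in the compiler or the development environment.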
The world will always need people who understand that asking for the last digit of Pi isn't a worthwhile request.
"Computer, sort this list of names, then beat me at chess without moving your queen, then formulate a method of reversing entropy." "Computer, tell me a joke."
If natural language aims at letting users tell the computer what to do in the terms they think about their tasks, the computer needs to be aware/intelligent to understand the requests. Otherwise there's always going to be a manual describing what you can and can't ask and how/how not to ask it. And people won't read manuals, they'll write programs that don't work.
And then, you and I will *finally* get programming jobs. :)
"A witty saying proves nothing." ~Voltaire
"d'Oh!" ~Homer
In a way, the languages of mathematics and music are natural languages. Nobody sat down one day and enumerated all of the rules for mathematical expressions; the notation evolved to suit the needs of mathematicians and has retained the flexibility that results from such evolution, much like "social" languages.
It's hard for programming languages to "evolve" in the same sense since they aren't "for humans, by humans", but we do try new language designs and find that some work better than others.
Some of the more "dynamic" languages go some way to enabling this kind of evolution. If I try to use an unusual construct in a mathematical expression, I'd probably follow it with a statement in English or mathematics explaining the meaning. If it was a useful construct, others will adopt it and slowly the explanation will become unnecessary. Likewise, in some languages we can define new constructs (within certain boundaries, of course) and tell the compiler what is meant by them in simpler terms, usually by writing some kind of function. Over time, popular constructs will be adopted as core features in newer languages. One example that springs to mind is the foreach construct, which does vary from language to language but arose because it was very common to want to visit each element in a list in turn and perform some operation on it. Modern languages have become a lot more expressive so this kind of evolution will probably become more common.
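The foreach example can be shown side by side with the index-based loop it grew out of. This is just a Python sketch with made-up function names; other languages spell the construct differently.

```python
def shout_indexed(items):
    """Older style: visit each element by managing an index by hand."""
    result = []
    i = 0
    while i < len(items):
        result.append(items[i].upper())
        i += 1
    return result


def shout_foreach(items):
    """foreach style: the language handles 'visit each element in turn'."""
    result = []
    for item in items:
        result.append(item.upper())
    return result


names = ["ada", "grace", "linus"]
print(shout_indexed(names) == shout_foreach(names))  # True
```

Both functions do the same work; the second simply has the common pattern built into the language, which is the kind of adoption described above.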