Metafor: Translating Natural Language to Code
vivin writes "Computer programming is second nature to most of the Slashdot crowd. However, this is not true for the vast majority of people. Formal programming languages are not as expressive or flexible as natural languages. This becomes more evident when we try to translate user requirements into actual code. Researchers at MIT have come up with a program that bridges this gap. It's not so much a tool that turns English into code, as it is a program that translates requirements (in English) to code. When Metafor analyzes English, nouns phrases become objects, verbs become functions, and adjectives become object attributes (or properties). In addition to helping programmers visualize their program better, I think it also promotes writing concise (and therefore) requirements and descriptions. Metafor doesn't handle run-on sentences (or bad English) that well." Update For for the dupe. Not going well. Appreciate all the hate mail. Really encourages improvement.
Well, I doubt something like this would be used to write the next version of Gimp, but I can see its use in helping people to convey what they want a computer to do. Few people need to write programs and I don't know whether I'd want people who don't understand computers to actually write them. But it would help when someone wants to make something like a 3D scene in Blender. It reminds me a lot of that episode of ST:TNG ("Schisms") where Riker, Troi and Worf are telling the computer to replicate an alien room that they were in.
While this is a cute concept, I don't think you'll be seeing computer programmers disappearing any time soon. The natural language bent was the original point of high level languages. Early languages like COBOL, SNOBOL, and BASIC were all designed to abstract programming to a level of natural language. Save for BASIC's success as a beginner's language, none of them accomplished their goal. In fact, the "natural language" design of COBOL only served to complicate the language and cause a variety of errors due to missing periods, improper spacing, and other common typing mistakes.
;-)
It wasn't long before languages actually broke away from English and relied more heavily on easily parsable special characters to define structure. We can see the results of this in today's C/C++, Java, LISP, PERL (bleh), and Python languages. This new interface does nothing but try to perform some of the structural thinking done by the programmer. (Although I have my doubts as to its current real world ability.)
So the question that then comes to bear is, "Who would use this natural language interface?" Sadly, the answer is most likely "programmers". But why would a programmer use this interface if he has to be trained in computer logic in the first place? It would seem like an unnecessary level of abstraction that would only serve to hinder a programmer's natural abilities.
Of course, there is the documentation issue. Supposedly this interface will be useful for producing requirements in addition to code. But who produces the requirements? Not the programmer. That's usually the job of the business analyst, someone who may not even have experience with coding logic. And for code documentation, nothing quite beats the JavaDoc style documentation that has become popular in the last few years.
I think that research like this is interesting, but I doubt it will have many uses until AI and voice recognition improve to a level similar to that seen in Star Trek. Only about 300 more years and counting.
Javascript + Nintendo DSi = DSiCade
How well would Metafor handle English like "nouns phrases become objects"?
I'm sure we've all experimented with header files that define an english-like syntax for our code. We've dumped it for a reason - it's not as efficient.
Hiding what's going on "under the hood" is never a good thing. Good code, like good food, depends on good ingredients, and the knowledge of how to combine them.
Crap food, on the other hand, can be produced by anyone with a stove and delusions of cooking ability.
I don't mean to sound pessimistic, but remember who comes up with functional specs: managers. As a consequence, this poor program may well come up with a framework that matches exactly what was requested, but once it's put together, the suits will say "it doesn't do this". When it's pointed out that that wasn't in the spec, the inevitable response will be "but it was implied; it should be obvious that we'd need it to do that." This is just a core dump waiting to happen.
I wish that people spoke mathematically rather than in poor and ambiguous languages that can now (supposedly) turn into (ambiguous) code. Can one really rely on translated 'code' like this? That's like sending an e-mail from speech-to-text recognition without proofreading.
My Linux - (L)ove (I)s (N)ever (U)tterly eXPensive
And how exactly does UML translate a user's requirements? It is another of these insane "let's draw lots of boxes for years instead of developing a thing; whoops, we ran out of money, bugger" type things.
nouns become objects, verbs become functions
Congratulations!!!
You've invented Smalltalk!
Welcome to 1980!
While the logic of the researchers' interpreter tackles only about 20 percent of the problem of full natural language programming, it achieves about 80 percent of the perceived rewards.
Ah, this old thing again.
The hard part about programming isn't turning basic English text into half-assed code. If it were, then Google would have built their company on just-out-of-college scripters and Visual Basic.
[Liu says] "Many subjects immediately identified the simplistic interpretation of the interpreter, and wanted the opportunity to rephrase their original wording to fix the error."
Yes, regular English is insufficient for programming. If a tool like this becomes popular, you'll still need a special class of people to figure out what is needed and to figure out how to phrase the desire in the precise way that makes this guy's interpreter actually do what they want.
In other words, he hasn't invented a way to eliminate programming or programmers. He's figured out a way to make a programming language that is slightly easier to learn at first. But because it's removed from what computers actually do, it's much harder to use for anything serious. The hard part about programming isn't the month you spend learning Java syntax, it's the many years you spend learning to write code well.
Their theory appears to be that this will make programming easier to learn. I wish them the best of luck in that goal, but having seen over the years a number of graphical and natural-language programming tools vanish without a trace, I'm not holding my breath.
If Metafor cannot handle bad spelling or grammer, then maybe using MS Word to do it for you will help. This is ment to be a joke, and not insightful.
A lot of work has been done on explicitly and unambiguously coding / capturing the semantics in natural language. True natural language programming might be impossible, but by chasing this quixotic goal, other more limited purposes might be enabled on the way.
What I've noticed a lot of times is that Engineers can't write documentation and requirements worth crap. Ok, so I shouldn't make such a blanket statement. Some engineers can write well, but most can't. I think this will do two things:
a) Help computer engineers describe their program's requirements better so others can follow them better.
b) Help beginning computer programmers make the connection between a natural language description of their program and the program itself. For example, this can help them understand concepts of OOP (nouns->objects, verbs->functions, adjectives->properties).
It's not true that programmers never write requirements. One of the classes I had to take in college dealt with the Software Engineering/Development Lifecycle. One of the things we had to do was to create a requirements specification for our program. We had to write it in a concise manner so as to map it to actual parts of our code. From the requirements document, we went to a UML tool (Rational Rose) and from there, to the (skeleton) code. That's what this software does, except it encourages us to write proper descriptions in the beginning itself, and then maps that to skeleton code.
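To make the nouns->objects, verbs->functions, adjectives->properties idea concrete, here is a toy sketch. It is not Metafor's actual output or algorithm: the sentence is tagged by hand, and the `skeleton` helper and the shape of the generated class are assumptions made purely for illustration.

```python
# Toy illustration only. The parts of speech are supplied by hand;
# nothing here parses English, and the output format is an assumption.

def skeleton(noun, verbs, adjectives):
    """Return Python source for a class skeleton built from tagged words."""
    lines = [f"class {noun.capitalize()}:"]       # noun -> class
    lines.append("    def __init__(self):")
    if adjectives:
        for adj in adjectives:
            lines.append(f"        self.{adj} = True  # adjective -> attribute")
    else:
        lines.append("        pass")
    for verb in verbs:
        lines.append("")
        lines.append(f"    def {verb}(self):  # verb -> method")
        lines.append("        raise NotImplementedError")
    return "\n".join(lines)


# Hand-tagged version of "The friendly bartender makes and serves drinks."
source = skeleton("bartender", ["make", "serve"], ["friendly"])
print(source)
```

The result is only a skeleton: every method body is left unimplemented, which is roughly the "requirements to skeleton code" step described above, not a finished program.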
Vivin Suresh Paliath
http://vivin.net
I like
Yes, this is not a new concept. And yes, it's been on every science fiction writer's radarscope since the concept of a calculating machine was considered.
And it's still science fiction because:
- Language is situational, societal, and emotional
- Most people doing this sort of work communicate in English. So they assume English is a good place to start from. Unfortunately, it's one seriously illogical language to start from.
There have been attempts to create 'natural language' programming languages. And in the main they HAVE been successful. Sure, they are inefficient. But so is human communication.
Every pseudo-code compiler / interpreter that I've ever seen (since the 1970's) has simply been a programming language. Sure, maybe they're a little nicer to look at, but they will always fail the Turing test.
Want TRUE natural language programming?
Develop a computer that works linguistically, not logically.
The trick is to not have morons write the requirements OR code the actual software..
It's amazing how much this helps.
Firstly, people for whom programming is too complicated should not code at all. We need fewer programmers building better code, not more programmers adding to the crap heap that the software legacy is as of today.
Secondly, I think that what is needed is the other way around: automated analysis of code and production of natural language reports that designers could browse more easily than the code itself when looking for bugs or designing extensions and additional features. They would then intervene directly on the code itself.
Sort of a souped up version of Knuth's literate programming, only with a much more radical transformation of the code for its visualization, bringing up the essential and critical aspects.
Think of how a reasonably complicated mathematical proof, say within formal set theory, would look in a math paper or book meant to be read. Compare with how it would be coded in a theorem prover. Different. Yet the former can be automatically generated from the latter.
The key to high quality software is controlling the complexity in the inter-function domain; the ordering of and the relationships between functions.
Converting natural language to software does NOT address this problem. Natural languages are not expressive or fluent at rigorously addressing complexity issues. Rather, formal methods address this problem; even weak formal methods such as state machines produce enormous benefits to code quality, since they force the author to consider all possible outcomes from all function calls and at the same time, by rigorously, logically and consistently exposing the behaviour of the program, permit far easier code modification by later authors.
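As a rough sketch of what "forcing the author to consider all possible outcomes" can look like, here is a small table-driven state machine. The states, events, and the `missing_transitions` helper are all made up for this example; the point is only that an explicit table can be checked for gaps mechanically.

```python
from enum import Enum
from itertools import product


class State(Enum):
    IDLE = "idle"
    RUNNING = "running"
    DONE = "done"


class Event(Enum):
    START = "start"
    FINISH = "finish"
    RESET = "reset"


# Every (state, event) pair is listed explicitly, including the ones
# that deliberately leave the state unchanged.
TRANSITIONS = {
    (State.IDLE, Event.START): State.RUNNING,
    (State.IDLE, Event.FINISH): State.IDLE,
    (State.IDLE, Event.RESET): State.IDLE,
    (State.RUNNING, Event.START): State.RUNNING,
    (State.RUNNING, Event.FINISH): State.DONE,
    (State.RUNNING, Event.RESET): State.IDLE,
    (State.DONE, Event.START): State.DONE,
    (State.DONE, Event.FINISH): State.DONE,
    (State.DONE, Event.RESET): State.IDLE,
}


def missing_transitions(table):
    """Return every (state, event) pair the table does not cover."""
    return [pair for pair in product(State, Event) if pair not in table]


def step(state, event):
    """Look up the next state; an uncovered pair raises KeyError."""
    return TRANSITIONS[(state, event)]


print(missing_transitions(TRANSITIONS))
print(step(State.RUNNING, Event.FINISH))
```

A later author who adds a new state or event gets a non-empty list from `missing_transitions` until every new combination has been decided, which is the kind of pressure an English description never applies.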
--
Toby
No matter where you go... there you are.
No fears, he's issued an update:
Update For for the dupe. Not going well. Appreciate all the hate mail. Really encourages improvement.
Nice. Someone should go to customer service 101 and grow up a little. Yelling at the people who (indirectly) line your wallet. Not a good idea....
Good quote, too many chars. Seriously, the slashdot 120 char limit sucks!
Update For for the dupe. Not going well. Appreciate all the hate mail. Really encourages improvement.
Unless you have an '*' next to your name, you don't pay for slashdot other than the electricity to run your box. Don't give me the "but the ads are there" tripe either. I know most of you either use Adblock or ignore them anyway.
The source is available. Start your own site.
I've never been one to complain about dupes. I figure I already get way more than I pay for from this site (which is zero). But if people are frustrated about dupes, maybe it's because it's an exceedingly simple problem to solve, and the Slashdot editors give every appearance of not bothering to lift a finger to solve it.
Patrick Doyle
I mod down every jackass who puts his moderation policy in his sig. Oh, wait a sec....
I would definitely be skeptical of such technologies. SQL was an attempt to make database queries close to natural language. In my opinion, SQL is the most difficult language to do complex tasks with. True, SQL was originally designed for queries, not stored procedures and such. The fact is that the need was there and SQL had to try and support it.
The inherent problem with using a natural language, or something similar to a natural language, for programming is that computers need exacting commands. While natural languages are more expressive, formal programming languages are precise.
Which is the whole reason why there are often bugs. Computers are not smart; they cannot guess what a user wants. They can only do exactly what a user specifies, whether it's right or wrong.
I mean, look at all the misunderstandings that occur between people. (Most) humans have a very sophisticated brain with a portion devoted to language. And still, we get things wrong very often when understanding others.
My point is I don't think there will be a replacement anytime soon for formal programming languages. I know that is not what this tool is trying to do, but I think its value is as a novelty at best. There is a long way to go before someone can speak into a microphone and a computer can make the assumptions necessary to perform a task with a reasonable degree of success.
Since when did every artist have to draw realistic artwork? Some artists draw anime, some artists draw comics, and some artists do draw stick figures and it is art to a lot of people. Unless you are going to program the space shuttle or the new unmanned probe, you don't need calculus. I've never used calculus in any of my code, not even once. Sure if you need to do realistic modeling of physics for your rocket, you'll need calculus, but unless you are going to be a rocket scientist why should we require all programmers get the training and education of a rocket scientist and then expect to be able to say "Well we need millions of programmers!". You cannot have millions of rocket scientists, you can have millions of good programmers. Do we really need a new game engine for every game? Can we reuse one good engine made by one rocket scientist over and over again? In art you'd reuse, in science you'd keep reinventing the wheel over and over again. Which is more efficient? Open source is art, and in art you don't need to re-invent the wheel, it's all about the expression of the code not the calculus, not the science, the expression. If you are working for the government on some top secret mission critical project or if you are working for NASA then I can understand why you'd need calculus and math. If you are going to get a degree in calculus and math just so you can sit in a cube and write simple C programs all day, you wasted your time learning bullshit you'll never use, and the school has filtered out millions of coders who may be able to write great code for portable mp3 players or great applications for Windows but who may not have the skill level to write the code for the new smart bomb or unmanned drone technology. Just because I cannot write code for advanced robotics or aerospace does not mean I cannot write basic applications, and that's what most people need.
Most of us are using basic applications and most applications which are profitable don't require any calculus or math at all to write. Look at the code to Enlightenment, there's no calculus in it. Look at the code to Gnutella, no calculus, so tell me how a calculus programmer is more valuable to the economy than an artist programmer when most of the good programs aren't made by the calculus programmers? Napster was designed by an artist, not a NASA scientist. When was the last time a NASA scientist designed anything for the masses?
To actually add content to my content:
How terribly difficult would it be to add a URL checker? I can understand a dupe when it's an article written differently by two publications, but a simple URL checker can state "this URL was used in story XYZ [with link]," so the editor can determine if it's a dupe story or just a URL used in two different stories...
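For what it's worth, a minimal sketch of such a URL checker might look like the following. It has nothing to do with the real Slashcode; the `normalize` and `find_dupes` names, the in-memory `archive` dictionary, and the example.com URLs are all hypothetical.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to a comparable key: lowercase host, no 'www.', no trailing slash."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    query = "?" + parts.query if parts.query else ""
    return host + path + query


def find_dupes(new_urls, archive):
    """Map each submitted URL to the earlier story that already used it.

    `archive` is assumed to map normalized URLs to story titles.
    """
    hits = {}
    for url in new_urls:
        key = normalize(url)
        if key in archive:
            hits[url] = archive[key]
    return hits


# Hypothetical archive with one previously posted story.
archive = {normalize("http://www.example.com/metafor/"): "English To Code Converter"}
print(find_dupes(["http://example.com/metafor", "http://example.com/other"], archive))
```

The checker only reports which earlier story used the same URL; deciding whether that makes the new submission a dupe is still left to the editor, as suggested above.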
Good quote, too many chars. Seriously, the slashdot 120 char limit sucks!
The point is, unless you're going to program the robot or house or whatever yourself, it's going to be a problem to set up a standardized system that a program can understand. Sure, you can add in set after set of "dialects" for a machine to interpret, but in the end there'll always be someone who sees English differently, and will put in something invalid. Closest I've seen is COBOL, and...well...we all know about COBOL don't we?
Update For for the dupe. Not going well. Appreciate all the hate mail. Really encourages improvement.
Taco, you asshole, you've been duping stories for years. You've known about them and yet you've done nothing to fix the problem. Don't pull this sentimental BS about not "encouraging" improvement. If the fact that Slashdot is your creation, and is your job, isn't enough "encouragement" for you to fix the problem of dupes, I don't think anything will be. From all appearances, it looks like you've given up on /. years ago.
-------
"Every artist is a cannibal, every poet is a thief."
Update For for the dupe. Not going well. Appreciate all the hate mail. Really encourages improvement.
Is he drunk or something (For for??) At least we know he reads the hate mail, so send more. Ignoring stupid avoidable fuckups certainly doesn't work.
They can either
1) Implement a simple function that compares the main words in the article with recent ones, particularly URLs (ignoring some obvious generic ones, like the home page of newspapers). (For extra credit, spellcheck the fucking thing, and check that any URLs exist.)
2) Read the mail that comes in from subscribers telling them they've duped (apparently that's mostly ignored; when I send it in it often bounces, some editors apparently have invalid forwarding addresses)
3)Or use their own brains and just type one relevant word into the Slashdot search box:
Search 'metafor'
Metafor: Translating Natural Language to Code
On March 30th, 2005 with 170 comments
vivin writes "Computer programming is second nature to most of the Slashdot crowd. However, this is not true for the vast majority of people. Formal...
English To Code Converter
On March 26th, 2005 with 52 comments
prostoalex writes "Metafor from MIT is a code visualization utility, capable of converting high-level descriptions into class and function (or method...
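Option 1 above could be sketched roughly like this. The stop-word list, the 0.15 threshold, and the `likely_dupes` helper are arbitrary assumptions for illustration, and the texts compared are just the two story titles from the search results plus a fragment of the second summary.

```python
import re

# A tiny, arbitrary stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "an", "and", "from", "in", "is", "of", "the", "to"}


def keywords(text):
    """Lowercase the text and return its set of non-stop words."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS}


def overlap(a, b):
    """Jaccard similarity of the two keyword sets (0.0 to 1.0)."""
    ka, kb = keywords(a), keywords(b)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def likely_dupes(candidate, recent, threshold=0.15):
    """Return titles of recent stories whose text overlaps the candidate."""
    return [title for title, text in recent.items()
            if overlap(candidate, text) >= threshold]


recent = {
    "English To Code Converter":
        "English To Code Converter: Metafor from MIT is a code visualization utility",
    "Some Unrelated Story":
        "Some unrelated story about a handheld console browser",
}
print(likely_dupes("Metafor: Translating Natural Language to Code", recent))
```

With so little text the score is crude, so anything it flags would still need a human to look at it; it only narrows down which earlier stories are worth a glance.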
Or a spell checker, or a secondary read-only review queue, or any number of other things....
"Update For for the dupe. Not going well. Appreciate all the hate mail. Really encourages improvement."
At LWE in January, 2001, at a conference on Slashcode, someone asked us at Newsforge how our site had so few dupes compared to Slashdot. There were two reasons:
1) We had a smaller audience of people to see the dupes we did accidentally post.
2) We searched our archives before posting any story. We searched by story URL, by story keyword. We also generally skimmed the site when we weren't working to be aware of what was being posted.
Clearly, #1 is something Slashdot doesn't have working in its immediate favour, but #2 is something that shouldn't be too hard. Zonk, Timothy, Cliff, Simonker, etc, don't post dupes nearly as frequently as CmdrTaco. Hemos doesn't post often, but he also seems to be pretty dupey (understandable as he's not really associated with /. anymore, and wouldn't be familiar with what's been posted).
The worst example is something like the PSP dupe story: Taco didn't even check out the games section, which had that story right at the top! A simple search for "PSP" and "Browser" would've shown it even if he never reads sections.
CmdrTaco doesn't read his own site. What does this tell you about how he feels about it?
I don't read Kuro5hin much anymore for the same reason. Complaining about dupes will just drive him further away, even though he still has to work on it by contract.
--
Internet Explorer (n): Another bug -- that is, a feature that can't be turned off -- in Windows.
Right. But computer science isn't (necessarily) about programming. If you want to program, learn to program. If you want to know how programming languages work, and why they work that way, do computer science.
Graphics is a big field of computer science, and you cannot understand graphics algorithms without calculus, complex numbers, and algebra. If you can't do the maths, you're stuck writing 2D scrollers. The 1% of programmers who actually use that "useless" maths are the ones writing the game engines.
Computer science is indeed a science (arguably a branch of mathematics). Programming may or may not be an art, but it is firmly grounded in science.
Oh, and point-and-click game creation has been done several times. It was shit.
It really bugs me that /. editors treat dupes as a sort of charming fact of life, as if dupes are among those imperfections that make life worth living. Dupes suck, if for no other reason, because they fork discussion, confuse the archive and make searches less precise.
It's especially annoying when the dupe article proclaims that the /. crowd is able to code by second nature. How f-ing hard is it to have a dupe checker (even a simple one like what FortKnox is proposing)? Why is it that a website that proclaims itself the bastion of all things FOSS has languished in mediocrity while thousands of competent coders are practically begging to write this feature into the site's backend? Of course, nothing will come of this, except more shoulder-shrugging and gee whiz, golly nonsense. I'm not trying to flame, but this sort of unprofessional, "friendly fuckup" attitude is what holds the public image of FOSS back.
Yeah I've had similar complaints about Use Cases. My biggest complaint is the typical round-pegs-in-square-holes of "describing what the user's experience should be does not come close to describing what the system should do"
In our practice we bridge that gap through requirements and logical design. It's inelegant and I don't like it, but when you're a consulting firm sometimes it's more important to be able to say that you use industry standard practices than it is to have the absolutely ideal solution.
I mean your homebrew project documentation standard might be really good. It might be the best ever invented. In fact, it might be so great that any programmer that looks at it can intuitively tell that it is superior.
The problem is that my clients are not programmers. My clients are almost universally unqualified to make any decisions about the fitness of development standards. If they were qualified to do that, they would either be developing internally or hiring contractors, not consultants. So using a standard that a third party has said is a good standard sometimes carries a lot of value over and above that of the standard itself...
I am disrespectful to dirt! Can you see that I am serious?!
Your points have merit, but what about average Joe that just needs some simple program or script and doesn't want to spend a few hours figuring out the syntax for it?
Everyone here seems to be quick to assume that the goal of a natural language like this is to be able to take the place of complex programs and what not. I see it more as a way for the average user to make on-the-fly programs for certain specific functions. The life and death stuff can come later, but for now, assuming this natural language stuff works out decently, if Joe is sitting in front of his computer and he wants to quickly do some calculations or compare some data, he won't have to search through hundreds of shareware or freeware sites; he can simply tell his computer "Create a program which does x, y and z" and it will do it for him. It doesn't need to be 100% perfect; it doesn't need to be foolproof, it just needs to work well enough to get the job done for him.
Everyone here just seems drawn to the extreme examples where the applicability of such a system seems outrageous, but for everyday applications, it could be a very useful tool.
What?
In addition to helping programmers visualize their program better, I think it also promotes writing concise (and therefore) requirements and descriptions.
Nope!
We don't need a way to transform natural language to program... we need a way to take program code and produce natural language.
I don't know any developer that actually likes writing documentation.
Bet this
But not all code is written for business apps. What about scientists who understand logic fairly well, but who have difficulty in C or C++? (There are a surprising number of such people.)
Would you prevent a mother with no medical training from putting a band-aid on her son's skinned knee? Sure, a trained doctor would know more about exactly how much (if any!) antibiotics to apply to the wound before applying the bandage, but surely you agree that the mother shouldn't be forced to pay his fees for this fairly trivial service?
I agree that this might not be for everyone or every project, and this particular instantiation might not even be for anyone or any project, but surely you must understand the importance of allowing the casual/hobbyist programmer to increase the tools in his toolbox!
Ben Hocking
Need a professional organizer?
> How terribly difficult would it be to add a url checker
These are paid editors who can't be bothered to read their own website. The problem is not technological, and doesn't require a technological fix.
I am no longer wasting my time with slashdot
No, an url checker isn't needed. They need to actually do their fucking jobs. Hell, I just read the site, and I know when a story is a dupe, as do hundreds of other readers. For a few of these guys, their ONLY JOB is to maintain the site. If they can't recognize when a story is a duplicate, they aren't doing their job. It shouldn't require an URL checker. These guys should get off their lazy asses. The problem is that some of us just keep coming back, giving them more pageviews, etc. If people got really sick of it and just came to /. less, maybe management would wake up and can these guys posting the stories. Sure, I know they started it, but that doesn't mean they're inherently more competent than anybody else.
I don't respond to AC's.
The use of special purpose languages is nothing new. They're used in mathematics, chemistry, music etc. ... pretty much anywhere where it is easier to use a special purpose language to express concepts than with a natural language. Sure there is a learning curve, but that is often the least of the hurdles in doing a reasonable job.
Take for example music: For the musically illiterate like myself, a music score means nothing. Yet, to someone who can read music it means a lot. I guess you could potentially write music in English: "Make a high-pitched sound for a bit, then a lower-pitched sound and then a higher-pitched sound....". Reading that would be hell and it is imprecise as to what it means. The hard part of being competent at music is not how to read the score.
Likewise, computer languages are the least barrier to effective programming. What matters more are the concepts and being able to express them effectively. Sure, some languages are easier to use than others (e.g. Python might be simpler than C and just about everything is simpler than Brainfuck), but programming in a natural language would be damn difficult. We already have some simple programming languages like VBScript and spreadsheets for "soft tasks" like customising spreadsheets and word processors etc.
Engineering is the art of compromise.
That would have been unnecessary, given a better citation. Which portions of Dijkstra's original argument are no longer applicable?
That was a weaker argument. Yet, as part of that generation, I wouldn't say there was "no problem". For many years I had to struggle against habits learned during that time, in order to use more structured approaches effectively. I have also seen similar struggles in my contemporaries.
DNA just wants to be free...