Startup Uses AI To Create Programs From Simple Screenshots (siliconangle.com)
An anonymous reader shares an article: A new neural network being built by a Danish startup called UIzard Technologies IVS has created an application that can transform raw designs of graphical user interfaces into actual source code that can be used to build them. Company founder Tony Beltramelli has just published a research paper that reveals how it has achieved that. It uses cutting-edge machine learning technologies to create a neural network that can generate code automatically when it's fed with screenshots of a GUI. The Pix2Code model actually outperforms many human coders because it can create code for three separate platforms, including Android, iOS and "web-based technologies," whereas many programmers are only able to do so for one platform. Pix2Code can create GUIs from screenshots with an accuracy of 77 percent, but that will improve as the algorithm learns more, the founder said.
It only generates the layout files for the different platforms.
We've had RAD systems for decades. They make the first 80% easy, but not the last 20%. One is always dealing with things like legacy databases with goofy schemas and domain-specific intricacies.
Tools that may take longer to lay down the basics but can be tuned more easily for specifics still seem the best bet.
Plus you have the issue of mobile devices, where UIs need to be "responsive" to different screen sizes. These can take a lot of experimentation to get right because context is involved. They are solving 1990s problems.
Was the first version of ResEdit released in 1984 or 1985? In any case, for more than thirty years, there have been developer tools that allowed you to draw a UI screen, while simultaneously creating a WYSIWYG screen image, an object-oriented description of the elements in the image (e.g. "a checkbox at 50,100"), and code to generate the image.
As nearly as I can tell, the only novelty here is the ability to work off a static image file, rather than being able to work off the time-sequence of the series of drawing manipulations used to draw the file. This wouldn't be a big deal even if it worked, since it doesn't take very long for a human to look at a UI screen and draw a duplicate layout using a UI layout tool.
As for "77% accuracy," I have no idea what that means or how you calculate the percentage, but it sounds like "it doesn't work," because the amount of work needed to correct something that is only 77% accurate is probably about the same as, and quite possibly more than, the amount of work needed to create it from scratch with a good layout tool.
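To make that concrete, here is a rough sketch of one way such a number could be computed. This is only a guess: the summary doesn't define the metric, and the token names and the `token_accuracy` function are made up for illustration. It compares a predicted layout token stream against a reference, position by position.

```python
def token_accuracy(predicted, reference):
    """Fraction of positions where the predicted token equals the reference.

    Hypothetical metric for illustration; not necessarily what the paper uses.
    """
    length = max(len(predicted), len(reference))
    if length == 0:
        return 1.0
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / length


# Made-up layout tokens: two rows containing three widgets in total.
reference = ["row", "{", "button", "label", "}", "row", "{", "checkbox", "}"]
predicted = ["row", "{", "button", "button", "}", "row", "{", "label", "}"]

score = token_accuracy(predicted, reference)
print(f"{score:.0%}")
```

In this toy case 7 of 9 tokens match, which rounds to 78%, yet two of the three widgets are the wrong type. Under this reading, most of the "accuracy" comes from getting the brackets right.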
Furthermore, it is very common for a UI layout to contain elements that are only conditionally visible. An obvious one would be a tabbed panel. A screenshot can show you the controls that are in the frontmost tab page, but has no information at all that would allow pix2code to even begin to guess at the controls and other elements that are present in the other tab pages. Therefore, to get even a complete visual record of the interface, it is necessary to have some kind of procedure or script that results in every UI element being systematically revealed. That's not trivial. (Imagine some of the currently fashionable designs that save screen real estate by putting larger parts of the UI on invisible trays that only slide into view when needed.)
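A toy model of the tabbed-panel problem, as a sketch only. The `Tab` and `TabbedPanel` classes and the widget names are invented for illustration and have nothing to do with pix2code's internals. A single screenshot only ever sees the active tab; recovering the full widget list requires a script that walks every tab.

```python
from dataclasses import dataclass


@dataclass
class Tab:
    title: str
    widgets: list


@dataclass
class TabbedPanel:
    tabs: list
    active: int = 0

    def screenshot_contents(self):
        # A screenshot shows every tab title, but only the active tab's widgets.
        titles = [tab.title for tab in self.tabs]
        return titles + list(self.tabs[self.active].widgets)


def reveal_everything(panel):
    """Activate each tab in turn and collect every widget that becomes visible."""
    seen = []
    for index, tab in enumerate(panel.tabs):
        panel.active = index
        for widget in tab.widgets:
            if widget not in seen:
                seen.append(widget)
    return seen


panel = TabbedPanel(tabs=[
    Tab("General", ["name_field", "ok_button"]),
    Tab("Advanced", ["proxy_field", "timeout_slider"]),
])

one_shot = panel.screenshot_contents()
everything = reveal_everything(panel)
```

Here `one_shot` never contains `proxy_field` or `timeout_slider`, and the loop in `reveal_everything` is the easy case. Slide-out trays and state-dependent controls have no simple index to iterate over.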
Programs that can auto-generate glue code from a GUI input have been around for decades. You just need to fill in the stubs with the other 99% of the code. The problem with these GUI input systems is that the code they generate is fragile and even small changes are often more difficult than just starting over. I have never found them useful, but they are popular for iPhone and Android apps.
I read TFA, and it says almost nothing, but I think the new thing about this system is that you don't need to use a GUI input, and instead you can just show it a picture or screenshot of an existing GUI (say, your competitor's product), and it will auto-generate code to create that GUI, with stubs for the actual functionality. That seems pretty slick.
The output from a layout editor is a structured description of the components and their layout. This is inferring that description from a .png - the LSTM is building a description of the structural relationship between the widgets from the input image.
It looks pretty cool, although quite simple. The intermediate token stream that it is inferring may be more interesting as a design tool than the neural network on the front-end that is building it.
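To illustrate why the intermediate token stream is the interesting part, here is a minimal sketch that takes a flat stream of layout tokens, rebuilds the nesting, and emits HTML. The token vocabulary, the `parse` and `to_html` functions, and the HTML snippets are all assumptions made up for this example, not the actual pix2code format.

```python
def parse(tokens):
    """Turn a flat token list into a nested tree of (name, children) tuples."""
    root = ("root", [])
    stack = [root]
    last = None  # most recent node, which a following "{" would open
    for tok in tokens:
        if tok == "{":
            if last is None:
                raise ValueError("'{' must follow a container name")
            stack.append(last)
            last = None
        elif tok == "}":
            if len(stack) == 1:
                raise ValueError("unbalanced '}'")
            stack.pop()
            last = None
        else:
            node = (tok, [])
            stack[-1][1].append(node)
            last = node
    if len(stack) != 1:
        raise ValueError("unclosed '{'")
    return root


# Hypothetical mapping from leaf tokens to HTML; one such table per target platform.
HTML_LEAVES = {
    "button": "<button>Button</button>",
    "label": "<span>Label</span>",
    "checkbox": '<input type="checkbox">',
}


def to_html(node):
    name, children = node
    if name in HTML_LEAVES:
        return HTML_LEAVES[name]
    inner = "".join(to_html(child) for child in children)
    return f'<div class="{name}">{inner}</div>'


tokens = ["row", "{", "button", "label", "}"]
html = to_html(parse(tokens))
print(html)
```

Swapping `HTML_LEAVES` and `to_html` for an Android or iOS emitter would leave the parser untouched, which is presumably why a single inferred description could target three platforms.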