Slashdot Mirror


Startup Uses AI To Create Programs From Simple Screenshots (siliconangle.com)

An anonymous reader shares an article: A new neural network being built by a Danish startup called UIzard Technologies IVS has created an application that can transform raw designs of graphical user interfaces into actual source code that can be used to build them. Company founder Tony Beltramelli has just published a research paper that reveals how it has achieved that. It uses cutting-edge machine learning technologies to create a neural network that can generate code automatically when it's fed with screenshots of a GUI. The Pix2Code model actually outperforms many human coders because it can create code for three separate platforms, including Android, iOS and "web-based technologies," whereas many programmers are only able to do so for one platform. Pix2Code can create GUIs from screenshots with an accuracy of 77 percent, but that will improve as the algorithm learns more, the founder said.

6 of 89 comments

  1. RAD Redux by Tablizer · · Score: 4, Interesting

    We've had RAD systems for decades. They make the first 80% easy, but not the last 20%. One is always dealing with things like legacy databases with goofy schemas and domain-specific intricacies.

    Tools that may take longer to lay down the basics but can be tuned more easily for specifics still seem the best bet.

    Plus you have issues with mobile devices, such that UIs need to be "responsive" to different screen sizes. These can take a lot of experimentation to get right because context is involved. They are solving 1990s problems.

  2. Re:I RTFA by HornWumpus · · Score: 4, Interesting

    Yep, everybody did win. Back in 1990, when Rapid Application Development (RAD in the hype of the day) tools did this.

    Your IDE still has this feature. Drag and drop UI drawing is better than having any UI inferred from a drawing. How do you draw a mask?

    Trying to do this well, transparently for multiple different devices plus 'browser' is challenging, to say the least. But...GOOD NEWS...each of these markets is big enough to support a UI team of its own. Claiming to do it well, automatically from a 'napkin sketch', for all significant platforms is braggadocious to the point where adults start to whisper about where the person's keeper is, calling him 'Sheldon'.

    But we all remember being 22 and doing similar; 'that's easy, just...' The time it takes from 'that's easy' to 'uhh....shit' is what separates success from failure, long term.

    The best this will do is produce a 'wrong' (control behavior from a drawing?) UI for a 'sketch artist' who hasn't bothered to learn to use his IDE. Somebody still has to come along, muddling through the messes (one per target), and fix it.

    --
    John McAfee 'It was like that time I hired that Bangkok prostitute; to do my taxes, while I fucked my accountant'
  3. It just accepts an image instead of a drawing? by dpbsmith · · Score: 3, Interesting

    Was the first version of ResEdit released in 1984 or 1985? In any case, for more than thirty years, there have been developer tools that allowed you to draw a UI screen, while simultaneously creating a WYSIWYG screen image, an object-oriented description of the elements in the image (e.g. "a checkbox at 50,100"), and code to generate the image.

    As nearly as I can tell, the only novelty here is the ability to work off a static image file, rather than being able to work off the time-sequence of the series of drawing manipulations used to draw the file. This wouldn't be a big deal even if it worked, since it doesn't take very long for a human to look at a UI screen and draw a duplicate layout using a UI layout tool.

    As for "77% accuracy," I have no idea what that means or how you calculate the percentage, but it sounds like "it doesn't work," because the amount of work needed to correct something that is only 77% accurate is probably about the same as, and quite possibly more than, the amount of work needed to create it from scratch with a good layout tool.
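
    The summary does not say how the 77 percent is computed. One plausible reading, offered here only as an assumption, is a per-token comparison between the generated layout description and a reference one. The function name `token_accuracy` and the token names below are hypothetical, not taken from the paper.

```python
def token_accuracy(predicted, reference):
    """Fraction of positions where the predicted token equals the reference.

    Assumption: accuracy is measured position by position, and any length
    mismatch counts as errors (we divide by the longer sequence).
    """
    if not predicted and not reference:
        return 1.0
    length = max(len(predicted), len(reference))
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / length


# Hypothetical layout tokens: one widget out of six is wrong.
reference = ["row", "{", "label", "checkbox", "button", "}"]
predicted = ["row", "{", "label", "switch", "button", "}"]
score = token_accuracy(predicted, reference)
print(f"{score:.2%}")
```

    Under this reading, a score in the high 70s still means roughly one token in four or five is wrong, which fits the point that fixing the output could cost as much as drawing the layout by hand.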

    Furthermore, it is very common for a UI layout to contain elements that are only conditionally visible. An obvious one would be a tabbed panel. A screenshot can show you the controls that are in the frontmost tab page, but has no information at all that would allow pix2code to even begin to guess at the controls and other elements that are present in the other tab pages. Therefore, to get even a complete visual record of the interface, it is necessary to have some kind of procedure or script that results in every UI element being systematically revealed. That's not trivial. (Imagine some of the currently fashionable designs that save screen real estate by putting larger parts of the UI on invisible trays that only slide into view when needed.)
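
    The "systematically reveal every element" procedure can be sketched as a breadth-first walk over UI states. This is a toy model under stated assumptions: the `ui` dictionary, its state names, and `collect_widgets` are all hypothetical, and a real app would have to drive actual taps and clicks to discover the transitions.

```python
from collections import deque


def collect_widgets(ui, start):
    """Visit every reachable UI state and gather the widgets seen in each."""
    seen = {start}
    queue = deque([start])
    widgets = set()
    while queue:
        state = queue.popleft()
        widgets |= set(ui[state]["widgets"])
        # Follow every action (tab click, tray open, ...) to a new state.
        for next_state in ui[state]["actions"].values():
            if next_state not in seen:
                seen.add(next_state)
                queue.append(next_state)
    return widgets


# Hypothetical app: two tab pages plus a slide-out tray.
ui = {
    "tab_general": {
        "widgets": ["name_field", "save_button"],
        "actions": {"click_tab_advanced": "tab_advanced", "open_tray": "tray"},
    },
    "tab_advanced": {
        "widgets": ["debug_checkbox", "save_button"],
        "actions": {"click_tab_general": "tab_general"},
    },
    "tray": {
        "widgets": ["share_button"],
        "actions": {"close_tray": "tab_general"},
    },
}

one_screenshot = set(ui["tab_general"]["widgets"])
all_widgets = collect_widgets(ui, "tab_general")
print(sorted(all_widgets - one_screenshot))
```

    Even in this tiny model, a single screenshot of the first tab misses the widgets on the other tab and in the tray.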

  4. Re:no need for AI by ShanghaiBill · · Score: 4, Interesting

    Programs that can auto-generate glue code from a GUI input have been around for decades. You just need to fill in the stubs with the other 99% of the code. The problem with these GUI input systems is that the code they generate is fragile and even small changes are often more difficult than just starting over. I have never found them useful, but they are popular for iPhone and Android apps.

    I read TFA, and it says almost nothing, but I think the new thing about this system is that you don't need to use a GUI input, and instead you can just show it a picture or screenshot of an existing GUI (say, your competitor's product), and it will auto-generate code to create that GUI, with stubs for the actual functionality. That seems pretty slick.

  5. Re:I RTFA by Fnkmaster · · Score: 3, Interesting

    Agreed. We built a nearly identical system (with OpenCV for morphological analysis and neural networks) about a year ago, as part of a larger AI-powered mobile app development platform.

    77% accuracy is not very impressive, but it's unclear what the training and test sets are here.

    The biggest functional win from this is actually getting sensible layout params from a designer's UI mockup - i.e. figuring out should this be right justified/left justified, should there be margin/padding here, etc. We solved that problem pretty well.
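
    As a rough illustration of the layout-parameter problem (not the system described above, whose internals aren't given), here is a minimal sketch that guesses horizontal alignment from a detected bounding box. The function `infer_alignment`, its parameters, and the 4-pixel tolerance are assumptions made for the example.

```python
def infer_alignment(box_left, box_right, container_width, tolerance=4):
    """Guess left/center/right alignment from a widget's horizontal extent.

    Coordinates are pixels measured from the container's left edge.
    """
    left_margin = box_left
    right_margin = container_width - box_right
    # Near-equal margins suggest the designer centered the element.
    if abs(left_margin - right_margin) <= tolerance:
        return "center"
    # Otherwise assume it hugs whichever edge it is closer to.
    return "left" if left_margin < right_margin else "right"


# Hypothetical boxes detected in a 320-pixel-wide mockup.
print(infer_alignment(16, 116, 320))
print(infer_alignment(204, 304, 320))
print(infer_alignment(110, 210, 320))
```

    A single mockup rarely settles the question on its own, since a short label near the left edge could be left-justified or just happen to sit there, which is why this kind of inference is the hard part.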

    Other challenges involve asset up-scaling, background image color extraction, etc. If you can take rough image mockups and output well polished asset packs, with vectorized images, layout files and stub code for developers to work with, that's a pretty significant win.

    We got that far with the project, but ended up shifting direction to a somewhat different market where there was more growth potential - literally nobody wanted to invest further in mobile app tools in 2016, AI powered or not.

    So yeah, cool proof-of-concept, but as a standalone offering this doesn't create much value. As part of a larger toolchain it might be valuable.

  6. Re:no need for AI by smallfries · · Score: 3, Interesting

    The output from a layout editor is a structured description of the components and their layout. This is inferring that description from a .png - the LSTM is building a description of the structural relationship between the widgets from the input image.

    It looks pretty cool, although quite simple. The intermediate token stream that it is inferring may be more interesting as a design tool than the neural network on the front-end that is building it.
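
    To make the "intermediate token stream" idea concrete, here is a minimal sketch of a flat token stream being parsed into a widget tree and rendered as HTML. The token vocabulary, the `parse` and `render` functions, and the HTML templates are all hypothetical; pix2code's actual intermediate language may look different.

```python
# Hypothetical templates: containers wrap their children, leaves stand alone.
CONTAINERS = {
    "root": '<div class="screen">{}</div>',
    "row": '<div class="row">{}</div>',
}
LEAVES = {
    "label": "<span>label</span>",
    "button": "<button>button</button>",
    "checkbox": '<input type="checkbox">',
}


def parse(tokens):
    """Turn a flat token stream into a nested (name, children) tree."""
    root = ("root", [])
    stack = [root]
    last = None  # most recent node, i.e. the one a following "{" opens
    for tok in tokens:
        if tok == "{":
            if last is None:
                raise ValueError("'{' without a container name")
            stack.append(last)
            last = None
        elif tok == "}":
            if len(stack) == 1:
                raise ValueError("unbalanced '}'")
            stack.pop()
            last = None
        else:
            node = (tok, [])
            stack[-1][1].append(node)
            last = node
    if len(stack) != 1:
        raise ValueError("unclosed '{'")
    return root


def render(node):
    """Render a tree to HTML; unknown token names raise KeyError."""
    name, children = node
    inner = "".join(render(child) for child in children)
    if name in CONTAINERS:
        return CONTAINERS[name].format(inner)
    return LEAVES[name]


tokens = "row { label checkbox } row { button }".split()
html = render(parse(tokens))
print(html)
```

    Swapping the template tables for Android or iOS equivalents would retarget the same token stream, which is one reason the intermediate form may be the more interesting half of the design.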

    --
    Slashdot: where don knuth is an idiot because he cant grasp the awesome power of php