Slashdot Mirror


MIT Offers Picture-Centric Programming To the Masses With Sikuli

coondoggie writes "Computer users with rudimentary skills will be able to program via screen shots rather than lines of code with a new graphical scripting language called Sikuli that was devised at the Massachusetts Institute of Technology. With a basic understanding of Python, people can write programs that incorporate screen shots of graphical user interface (GUI) elements to automate computer work. One example given by the authors of a paper about Sikuli is a script that notifies a person when his bus is rounding the corner so he can leave in time to catch it." Here's a video demo of the technology, and a paper explaining the concept (PDF).

11 of 154 comments (clear)

  1. MMO macro maker? by visgoth · · Score: 4, Interesting

    This looks like a powerful tool for gold / isk / whatever farming. I'm tempted to resurrect my eve account and see if I can make an auto-miner script.

    --
    My patience is infinite, my time is not.
  2. My grandmother knows python by Anonymous Coward · · Score: 5, Insightful

    "Computer users with rudimentary skills"..... "with a basic understanding of Python"?

    1. Re:My grandmother knows python by Fred_A · · Score: 5, Funny

      "Computer users with rudimentary skills"..... "with a basic understanding of Python"?

      Computer users with a rudimentary skill who do not have a basic understanding of Python can always build a Python programming AI in Lisp (or at least that's what I gathered from the MIT docs I browsed) and thus save themselves the trouble.

      --

      May contain traces of nut.
      Made from the freshest electrons.
    2. Re:My grandmother knows python by AardvarkCelery · · Score: 3, Informative

      If a friend wanted to learn just enough programming to do a few light chores, what would you recommend? Python is arguably one of the easiest languages to learn. Randy Pausch used it for Alice, which has been successful for teaching middle school girls how to program. So if "computer users with rudimentary skills" means rudimentary programming, then that works for me.

  3. The Cow pat model by Anne+Thwacks · · Score: 5, Funny
    Yeah - lets hear it for a new development model:

    For years I have been asking for a softwsare development tool that allows me to write PHP code by throwing cow-pats at the screem with the Wiimote.

    And my colleagues wat a tool that allows dispatching my bugs with the Wii gun attachment they use in "Quantum of Solace".

    --
    Sent from my ASR33 using ASCII
  4. Program, NOT code. Think MACRO by SmallFurryCreature · · Score: 3, Insightful

    From what I seen is this a macro program that can use screenshots rather then key/mouse data to automate tasks. So you PROGRAM your PC in the same way you PROGRAM a VCR to record a show. It is NOT the same as writing an application.

    But it seems very intresting once you got past this difference. Macro's are very handy for testing in my experience but often have a problem because a tiny mis-alignment can ruin it all. If this program is smarter because it can regonize where data is supposed to go... well that would certainly make automated tests a bit easier.

    Interesting stuff. Just don't think you will be writing software with this.

    --

    MMO Quests are like orgasms:

    You may solo them, I prefer them in a group.

  5. Re:FrontPage? by gad_zuki! · · Score: 4, Informative

    >And we all know FrontPge went on to become the defacto standard for web development....that had to be fixed by an real web developer later.

    Do you want to democratize technology or just have it controlled by elites? Non-techies want to do things like scripting and web design without paying a professional, the same way they want to fix things around the house or fix the car. When it comes to small or easy jobs, a non-expert can do just fine. Why should we piss on the DIY'ers because they dont have a Master's degree in CS? Frankly, a lot of computer stuff is pretty easy and paying someone is ridiculous.

    While Im certainly no fan of Frontpage, I feel that it wasnt much worse than Mozilla Composer or other WSIWYG html composers.

  6. Re:FrontPage? by BobMcD · · Score: 4, Insightful

    Then we end up with Visual Basic(the birth of Visual Basic came with the motto "its so easy you know longer need programmers... managers can write the code"... that worked out well).

    From a business point of view, it actually did. People used VB, and particularly VB macros in Office, to do things that resulted in a lot of dollars flowing through a lot of organizations. Yes it did eventually need to be changed out, but in it's time, for it's purpose, you can't really fault it. It truly did work.

  7. Think executable step-by-step tutorials by tucuxi · · Score: 4, Insightful

    Sikuli is certainly not commercial-grade UI testing software. It was never intended to be, this is academic software written to explore ideas, rather than to polish them to perfection. Also, it is not a "general" programming language. The previous posters that compared it to video-programming are right: not all programs have to target complicated algorithms and data-structures, there is plenty of space for automating "simple stuff".

    As an idea, I find the readability of the code particularly interesting. Sikuli code is about the closest you can come to self-explanatory, step-by-step instructions on how to achieve whatever a particular program does. Add a few comments to the most arcane steps, publish those programs to an online repository, and presto! executable step-by-step tutorials.

    Yes, the developers may have to address the variability of themes on people's desktops. It is certainly possible to do so (for instance, by keeping a list of mappings from any of a set of "supported" themes to a "canonical" theme, which would be used in all examples), but, as far as ideas go, I really think that Sikuli is a very refreshing idea.

    1. Re:Think executable step-by-step tutorials by tristanreid · · Score: 3, Interesting

      I totally agree. I watched the youtube video (is WTFYV the equivalent of RTFA?), and I was kind of impressed. Although the demo shows an interaction with a bunch of buttons, the real power is the image recognition. She showed how with one command each you can script the two of the fundamental interactions you have with images on the screen: click it, or wait for it to appear. The fuzzy visual recognition algorithms are a huge plus. If you wanted to script something in your room using a web-cam, this is basically how to do it with trivial coding.

      I think of this as an equivalent to something like sql. There's a domain in which you'd like to impose logical structure (relational data / images), and you generally use the language to great effect in conjunction with another programming language. If I had to write a scheduled task for my laptop that needed for me to be on the VPN, I'd much rather use something like this to handle the connection rather than trying to figure out how the VPN API works.

      -t.

  8. Re:FrontPage? by AardvarkCelery · · Score: 3, Insightful

    Yeah, that's real easy for a programmer to say. Ever used a brownie mix? I'll bet a pastry chef would say, "I'd like to see people who wish to bake brownies actually learn how to bake brownies properly." Tools like Sikuli are the programming equivalent to brownie mix. It's easy gratification. (... or at least easier than learning to capture part of the screen and then do fuzzy image pattern matching on it.) If I were a very casual, light duty programmer, this would be pretty helpful sometimes.