Slashdot Mirror


Researchers Secretly Deployed A Bot That Submitted Bug-Fixing Pull Requests (medium.com)

An anonymous reader quotes Martin Monperrus, a professor of software at Stockholm's KTH Royal Institute of Technology: Repairnator is a bot. It constantly monitors software bugs discovered during continuous integration of open-source software and tries to fix them automatically. If it succeeds to synthesize a valid patch, Repairnator proposes the patch to the human developers, disguised under a fake human identity. To date, Repairnator has been able to produce 5 patches that were accepted by the human developers and permanently merged in the code base...

It analyzes bugs and produces patches, in the same way as human developers involved in software maintenance activities. This idea of a program repair bot is disruptive, because today humans are responsible for fixing bugs. In others words, we are talking about a bot meant to (partially) replace human developers for tedious tasks.... [F]or a patch to be human-competitive 1) the bot has to synthesize the patch faster than the human developer 2) the patch has to be judged good-enough by the human developer and permanently merged in the code base.... We believe that Repairnator prefigures a certain future of software development, where bots and humans will smoothly collaborate and even cooperate on software artifacts.

Their fake identity was a software engineer named Luc Esape, with a profile picture that "looks like a junior developer, eager to make open-source contributions... humans tend to have a priori biases against machines, and are more tolerant to errors if the contribution comes from a human peer. In the context of program repair, this means that developers may put the bar higher on the quality of the patch, if they know that the patch comes from a bot."

The researchers proudly published the approving comments on their merged patches -- although a conundrum arose when Repairnator submitted a patch for Eclipse Ditto, only to be told that "We can only accept pull-requests which come from users who signed the Eclipse Foundation Contributor License Agreement."

"We were puzzled because a bot cannot physically or morally sign a license agreement and is probably not entitled to do so. Who owns the intellectual property and responsibility of a bot contribution: the robot operator, the bot implementer or the repair algorithm designer?"

41 of 87 comments

  1. Who owns a bot's intellectual property? by Anonymous Coward · · Score: 3, Insightful

    Who owns the intellectual property and responsibility of a bot contribution: the robot operator, the bot implementer or the repair algorithm designer?

    easy one: nobody

    Copyright applies to creative works. A machine-produced work is not creative, since any similar machine could and would produce it.

    1. Re:Who owns a bot's intellectual property? by darkain · · Score: 3, Insightful

      This is great... in theory... Until it is actually tried in court, and they possibly rule a different way.

    2. Re:Who owns a bot's intellectual property? by Megol · · Score: 2

      One could argue that the designer and/or programmer of the algorithm is providing the creative input. If the algorithm is based on machine learning it becomes a bit more difficult: can the selection of training data be considered creative?

    3. Re:Who owns a bot's intellectual property? by ShanghaiBill · · Score: 1

      One could argue that the designer and/or programmer of the algorithm is providing the creative input.

      One could argue that, but there is very little legal precedent to support it. Once an algorithm is making something, it is no longer "creative" in a copyrightable sense. A copyright to a software tool does not entitle the owner to copyright for the product of that tool, unless human-created components such as libraries are included.

    4. Re:Who owns a bot's intellectual property? by Spamalope · · Score: 1

      The first multi-national company that decides it doesn't want anyone else to be able to use it.

    5. Re:Who owns a bot's intellectual property? by helpfulcorn · · Score: 1

      My thought is that perhaps one could argue something akin to: if one writes or even uses a program to generate art in a random way, they could likely claim (IANAL) it's their creation, certainly sell it as their own and possibly sue someone who copies it in an unauthorised way. This may work especially well as an argument due to a common lack of understanding amongst lawyers, government, etc.

      As a thought though, I wouldn't try to argue that it'd belong to the creator of the program since anyone can run it and then suddenly that creator would own an insane amount of code.

    6. Re:Who owns a bot's intellectual property? by Anonymous Coward · · Score: 1

      Mod up.
      What do you think compilers do when you set options like 'Bounds Checking' and 'Type checking'?

      During Y2K (remember that?), dozens of skillful edit macros written in REXX modified millions of lines of COBOL code. We did not call it AI. We did not call it a bot. The OpenBSD people wrote macros to fix sloppy coding and string length issues - something Microsoft does not claim yet.

      There were cross-translators that converted COBOL to C and other languages.
      Borland Turbo C and Foxpro set some standards not seen today. Elegant efficiency and small code that would pass first year university.

      It seems to me today's code does not check function return codes, and shitheads will not code 'other' in a CASE statement because that causes 'work'. I can't comment on an operating system looking at a string declared as a variable - and thinking 'I will execute it because it looks like a URL!' That never happened in COBOL or Fortran.

  2. There's a lesson in this article. by Anonymous Coward · · Score: 4, Insightful


    During Expedition #1, whose results are presented in details in [7], Repairnator has analyzed 11,523 builds with test failures. For 3,551 of them (30.82%), Repairnator was able to locally reproduce the test failure. Out of 3,551 repair attempts, Repairnator found 15 patches that could make the CI build pass.

    Translation: Repairnator was able to fix 0.4% of the bugs it saw.
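
    As a quick check of the percentages, here is a minimal sketch that recomputes them from the three figures in the quoted excerpt. The variable names and the pct helper are made up for illustration; only the three counts come from the quote.

```python
# Figures quoted from the Expedition #1 excerpt above.
builds_with_failures = 11_523   # builds with test failures that were analyzed
reproduced = 3_551              # failures reproduced locally (repair attempts)
patches_passing_ci = 15         # patches that made the CI build pass


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Share of analyzed builds whose failure could be reproduced locally.
reproduction_rate = pct(reproduced, builds_with_failures)
# Share of repair attempts that produced a CI-passing patch.
patch_rate_of_attempts = pct(patches_passing_ci, reproduced)
# Share of all analyzed builds that ended up with a CI-passing patch.
patch_rate_of_all_builds = pct(patches_passing_ci, builds_with_failures)

print(reproduction_rate, patch_rate_of_attempts, patch_rate_of_all_builds)
```

    The 0.4% figure is measured against the 3,551 repair attempts; against all 11,523 analyzed builds the share is smaller still.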

    A program repair bot is an artificial agent that tries to synthesize source code patches. It analyzes bugs and produces patches, in the same way as human developers involved in software maintenance activities. This idea of a program repair bot is disruptive, because today humans are responsible for fixing bugs. In others words, we are talking about a bot meant to (partially) replace human developers for tedious tasks.

    They could have stated "Our goal is to enhance the performance of programmers", because that is what tools do, and there are tons of businesses with sub-optimal solutions to their business process. Instead they use intentionally menacing speech.


    Their fake identity was a software engineer named Luc Esape, with a profile picture that "looks like a junior developer, eager to make open-source contributions... humans tend to have a priori biases against machines, and are more tolerant to errors if the contribution comes from a human peer. In the context of program repair, this means that developers may put the bar higher on the quality of the patch, if they know that the patch comes from a bot."

    Translation: We spent a fair amount of time lying to people, justifying it as a means to an end, not realizing that lying to people might cause them to not believe anything we say.

    Sounds like this guy is soon to be unemployed.

    1. Re:There's a lesson in this article. by gweihir · · Score: 1

      Hahahahaha, nice! This basically shows that automation is actually incapable of tackling this problem. It probably wasted more human time with the 10 bad patches than it saved by producing the 5 that got accepted (but are not necessarily good).

      --
      Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
    2. Re:There's a lesson in this article. by ShanghaiBill · · Score: 4, Insightful

      This basically shows that automation is actually incapable of tackling this problem.

      Not really. If a company spends $10M per year fixing bugs, and this tool fixes 0.4% of them, then it just saved $40,000. Not bad for a free automated tool.
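
      The $40,000 figure is straightforward arithmetic, sketched below with exact fractions. The variable names are illustrative, and the calculation rests on an assumption the comment leaves implicit: that every bug costs the same to fix, so fixing 0.4% of the bugs saves 0.4% of the budget.

```python
from fractions import Fraction

# Hypothetical figures taken from the comment above.
annual_bug_fix_budget = 10_000_000   # dollars spent per year fixing bugs
fix_rate = Fraction(4, 1000)         # 0.4% of bugs fixed by the tool

# Assumes uniform cost per bug, so savings scale linearly with the fix rate.
savings = annual_bug_fix_budget * fix_rate

print(int(savings))
```

      If the bugs the tool can fix are cheaper than average to fix by hand, the real saving would be lower than this estimate.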

      Also, the hard part of fixing bugs is not writing the patch, but replicating, locating and isolating the problem. If you can say "Here is a bug, here is how to replicate it, and it is occurring in THIS function", then that is a big help, and this tool was able to do that for over 30% of the bugs. That is huge.

      It probably wasted more human time with the 10 bad patches than it saved by producing the 5

      Not necessarily. If the code looks like it has a bug, then it likely needs to be refactored to make it more readable, even if there is no actual bug. It is not enough for code to be correct, it also should be clear so it can be read and maintained.

    3. Re:There's a lesson in this article. by citizenr · · Score: 1

      Translation: Repairnator was able to fix 0.4% of the bugs it saw.

      "fix"
      Their bot commented out parts of the code that generated null pointer exceptions. It's like commenting out half of the Win95 kernel and calling it a fix.

      --
      Who logs in to gdm? Not I, said the duck.
  3. Disruptive? by stephanruby · · Score: 4, Interesting

    If the author was so confident in his bot, he would have attached his own name to it instead of making up a fake name for it.

    Also, I don't see why he thinks his idea is so novel. Static analysis, for instance, can suggest solutions if you want. And if you're too lazy to double-check the work yourself and let someone else do it for you, that's not a great discovery, that's just laziness.

    1. Re:Disruptive? by slacka · · Score: 1

      Exactly. For example, static analysis is great at catching copy/paste errors and will often show you the suggested fix. If I write a script that takes these results and submits a pull request, it's no different than if I manually take those actions. I, the script author, would take credit, not Coverity or PVS Studio. This guy is out to lunch, thinking his tool is sentient and has legal rights.

    2. Re:Disruptive? by jameson · · Score: 1

      I'm not aware of any static analysis that can factor in unit tests (as this work does) to decide what does and what does not make a suitable patch. Note that these patches are about dynamic semantics, not just about name or type analysis failures (which might indeed be easy to fix purely by using static analysis).

      The reason for not attaching his name (or the name of the students working on the project) is to minimise bias. A patch coming in from a widely-published and highly experienced formerly-INRIA-now-KTH researcher might be viewed very differently than a patch coming in from a noob.

      The reason for "not just checking the patches themselves" is again to reduce bias, though you'll note that they did a manual sanity check every time before they submitted the patch. That is, they only submitted patches that they hoped wouldn't waste the developers' time after submission. Evaluating purely by manual validation means that the judgment of whether the result is useful relies purely on the honesty of the researcher, since most publication venues don't leave enough space for researchers to submit source code (nor sufficient time for the reviewers to analyse the source code and patches in detail, unless it's all pretty small and can be squeezed in with the rest of what one must describe in the typical 6-12 pages (double-column) or up-to-25 pages (single-column)).

  4. Faster? by dbrueck · · Score: 2

    > 1) the bot has to synthesize the patch faster than the human developer

    I guess this is ok if it's faster in the sense of "the bot has to fix the bug faster than the human developer gets around to it", but in general I don't think this speed requirement needs to be that strict. If it's some latent bug waiting to hit us in production, for example, I don't care if the bot is slowly poring over code for days on end and brings a bug to my attention. Likewise, in any large project there are always tons of bugs that might not be that hard to fix, but it really is just a matter of getting around to them, so having some bot take a crack at them could be a good thing.

  5. How many were rejected? by Gravis+Zero · · Score: 4, Insightful

    What I'm reading is that yes, it made 5 patches that were accepted but the more important question is how many patches did it make total? If it made 800 patches and only 5 were accepted, that's kind of a problem.

    Also, there is good reason to distrust robotic submissions: there is no cognitive reasoning in generating patches. This means that it could very well make things worse rather than making them better. Sure, it could make your project build but it could also create an innocuous bug that breaks the code's functionality in the process which is likely to take even more time to correct because in addition to fixing the problem you also have to find it. Build failures already tell you where the problem exists.

    --
    Anons need not reply. Questions end with a question mark.
    1. Re:How many were rejected? by Anonymous Coward · · Score: 2, Informative

      What I'm reading is that yes, it made 5 patches that were accepted but the more important question is how many patches did it make total? If it made 800 patches and only 5 were accepted, that's kind of a problem.

      It made 15 patches, of which 5 were accepted and 10 were rejected.

      However, it tried to make 3,551 patches and only succeeded in 15.

    2. Re:How many were rejected? by TFlan91 · · Score: 3, Informative

      However, it identified 3,551 bugs, was able to automatically fix 15, and notified a human of the rest.

      FTFY.

      In that case, this sounds totally worth it. That's 3,551 bugs that QA or the client don't have to run into.

    3. Re:How many were rejected? by Raenex · · Score: 2

      In that case, this sounds totally worth it. That's 3,551 bugs that QA or the client don't have to run into.

      No, those were failures via automated tests. Meaning they were already known about. All this program did was provide a handful of fixes after trawling through thousands of known failures. Big whoop.

    4. Re:How many were rejected? by Anonymous Coward · · Score: 1

      In fact, of those 15 patches, none got accepted; they were rejected or found to be of low quality (i.e. not valid patches at all). They call that "expedition #1".
      Then they ran it again a bit later, expedition #2, and got 5 patches accepted. They don't declare the same comparative figures as for the first run.

      Also they claim their system can fix 30 bugs a day. I wonder how they got to that number.

      All in all it's all hogwash. Interesting, yes, yet still hogwash.

    5. Re:How many were rejected? by nasch · · Score: 1

      it could also create an innocuous bug

      So could a human, and it happens all the time. That's what testing (automated and manual) is for.

      Build failures already tell you where the problem exists.

      This bot was fixing build failures?

    6. Re:How many were rejected? by Gravis+Zero · · Score: 1

      So could a human, and it happens all the time. That's what testing (automated and manual) is for.

      True enough, but there is no cognitive design or debugging occurring, just code mutation, rebuilding and testing. In six months (and thousands of attempts later) it managed to make 15 patches, five of which were accepted. Some (if not all) of the five that were accepted needed to be modified as well.
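
      A "mutate, rebuild, test" loop of the kind described here can be sketched as a toy generate-and-validate search. This is not Repairnator's actual algorithm; the buggy function, the three candidate mutations and the test cases are all hypothetical, chosen only to show the shape of the loop.

```python
from typing import Callable, Iterable, Optional

# A candidate patch is modeled as a replacement implementation of the function.
Candidate = Callable[[list], int]
TestCase = tuple[list, int]


def buggy_last(items: list) -> int:
    # Hypothetical bug: off-by-one, indexes one past the end of the list.
    return items[len(items)]


def candidate_patches() -> Iterable[Candidate]:
    # Hypothetical mutations of the faulty index expression.
    yield lambda items: items[len(items) + 1]
    yield lambda items: items[0]
    yield lambda items: items[len(items) - 1]


def passes_tests(fn: Candidate, cases: list[TestCase]) -> bool:
    # A candidate is accepted only if every test passes without raising.
    for args, expected in cases:
        try:
            if fn(args) != expected:
                return False
        except Exception:
            return False
    return True


def repair(candidates: Iterable[Candidate], cases: list[TestCase]) -> Optional[int]:
    # Return the index of the first candidate that passes, or None.
    for index, fn in enumerate(candidates):
        if passes_tests(fn, cases):
            return index
    return None


strong_suite = [([1, 2, 3], 3), ([7], 7), ([4, 9], 9)]
weak_suite = [([7], 7)]

print(repair(candidate_patches(), strong_suite))
print(repair(candidate_patches(), weak_suite))
```

      With the stronger suite only the correct mutation survives. With the single-element test, the search stops at the wrong patch that returns the first element, so the tests pass while the behaviour is still broken.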

      This bot was fixing build failures?

      I got that part wrong. From what I read, it's fixing unit testing failures.

      --
      Anons need not reply. Questions end with a question mark.
    7. Re:How many were rejected? by nasch · · Score: 1

      From what I read, it's fixing unit testing failures.

      Well there you go. If the unit tests all pass after the fix, and test coverage is acceptable, and there's a QA program to test the kind of thing unit tests don't cover, I don't see the issue. It's true that the entity that created the bug fix didn't understand what it was doing, but a human (the person who accepted it) presumably did understand it, and it passes all the tests that would be required of a bug fix from a human. Either the testing is adequate, or it isn't. If it is, then we can be assured the bug fix was adequately tested. If it isn't, you're rolling the dice whether the fix was from a human or a bot.

    8. Re:How many were rejected? by Gravis+Zero · · Score: 1

      Either the testing is adequate, or it isn't. If it is, then we can be assured the bug fix was adequately tested. If it isn't, you're rolling the dice whether the fix was from a human or a bot.

      You are correct. It's worth noting that it attempted to fix unit tests on 3000+ projects over six months, so it may be that poorly designed unit tests are what enabled it to "succeed" in making a patch for 15 projects.

      --
      Anons need not reply. Questions end with a question mark.
  6. Re:Programmers are obsolete by dbrueck · · Score: 2

    How often is that claim really made? And if ever, how often is it actually made by developers? The only time I've ever heard that claim put forward is like you did: right before shooting it down.

    Automation is part of what makes development fun (there's a certain thrill to replacing something tedious with a "machine" that does it for you).

  7. The robot operator owns property & responsibil by Anonymous Coward · · Score: 1

    How is this even a question?

    If you go and apply image filters to pictures on Flickr using SoftwareA... that doesn't mean SoftwareA is responsible for what you did or now owns what you produced, nor do SoftwareA's developers.

  8. Re:Programmers are obsolete by mrbester · · Score: 1

    I don't recall when hammers could make hammers by themselves. Did they only appear in a Pink Floyd video?

    --
    "Wait. Something's happening. It's opening up! My God, it's full of apricots!"
  9. Unnecessarily complex by skoskav · · Score: 1

    A more pragmatic approach for a project may be to run static code analysis on each commit in order to highlight potential bugs and vulnerabilities. Creating a bot that waits for tests to fail and swoops in with a patch based on such analysis then seems like a mostly redundant addition.

  10. Re: Programmers are obsolete by Zero__Kelvin · · Score: 1

    He didn't say it in the right way, but he meant that he believes machines will program themselves one day.

    --
    Guns don't kill people; Physics kills people! - John Lithgow as Dick Solomon on Third Rock From The Sun
  11. Very bad idea by gweihir · · Score: 1

    In ordinary patch submission, there are two instances with actual intelligence and understanding: The patch creator and the maintainer. Here, there is only one: The maintainer. This violates the four-eyes principle. If the maintainer makes a mistake, the most stupid (in a non-obvious way) code makes it into the software.

    Automated tools should never be used to decide anything. They should always only provide input to a human expert that knows exactly how the input was created and that there is no intelligence in that mechanism.

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
  12. Re:Programmers are obsolete by gweihir · · Score: 1

    Most developers are at risk of losing their jobs if anybody realizes how bad they actually are. Automation would just be the thing that shows that, not the thing that replaces them. Good developers will not get replaced until we have working strong AI, which is not happening any time soon. (A senior member of the IBM Watson team put it as "certainly not in the next 50 years" to me.)

    --
    Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
  13. Re: Programmers are obsolete by dbrueck · · Score: 1

    Eh, he's claiming that some broad segment of "developers" claim something and that this is evidence they are wrong.

  14. Impersonating a bot by Morgaine · · Score: 1

    stephanruby wrote:

    If the author was so confident in his bot, he would have attached his own name to it instead of making up a fake name for it.

    It would be unethical for the human to impersonate a bot.

    What's more, the bot has no means to give the human legal authority to impersonate itself. Conundrum! :P

    [Oh dear. By the time I got to the end of this post, I began to realize that I was no longer quite so sure that I was joking.]

    --
    "The question of whether machines can think is no more interesting than [] whether submarines can swim" - Dijkstra
  15. Look at the patches themselves by craighansen · · Score: 2

    I went to the trouble to look at the patches themselves, and they appeared to be lacking any documentation of which failing test case was fixed. If the bot did a better job of documenting the rationale for the patch, perhaps they'd get a better acceptance rate. (And by the way, the acceptance rate wasn't reported in the article, only that 5 patches were accepted over a 6 month period. - Was that 5 out of 6, or 5 out of 100?)
    Otherwise, I'd think that a program-generated patch that indicated that it fixed a failing test case and didn't cause any additional failures in the test suite, and that clearly indicated that it wasn't claiming proprietary rights to the patch, would be generally welcome. I don't see the need for any secrecy.

  16. Researchers are obsolete by PolygamousRanchKid+ · · Score: 2

    Researchers Secretly Deployed A Bot That Submitted Bug-Fixing Pull Requests

    AI Bots Secretly Deployed A Researcher That Submitted This Research Paper

    --
    Schroedinger's Brexit: The UK is both in and out of the EU at the same time!
  17. Re:Programmers are obsolete by HiThere · · Score: 1

    Yeah, I remember that. It could handle choosing columns from a table if you told it which column ahead of time.

    That doesn't prove this is the same kind of thing, but PR flacks aren't any more moderate now than then, so it could well be.

    --

    I think we've pushed this "anyone can grow up to be president" thing too far.
  18. Re:Programmers are obsolete by dbrueck · · Score: 1

    How often is that claim really made? And if ever, how often is it actually made by developers? The only time I've ever heard that claim put forward is like you did: right before shooting it down.

    Automation is part of what makes development fun (there's a certain thrill to replacing something tedious with a "machine" that does it for you).

    Frequently here on slashdot whenever people speak of automation costing jobs. Inevitably some developer will claim that they won't be the ones being ousted.

    Are you sure? For kicks I checked a few stories (https://hardware.slashdot.org/story/18/04/24/2333259/a-study-finds-half-of-jobs-are-vulnerable-to-automation, https://hardware.slashdot.org/..., and https://slashdot.org/story/18/...) and didn't find really anything at all to suggest that "software developers claim they are immune to automation".

    In fact, if comments on those articles are any indicator, on slashdot it's typically the software developers who are in the "anything can be automated" camp. And really that makes sense - who is more likely to see where automation can go, the people who have decades of first hand knowledge and experience of progress in automation (software developers) or people who have relatively little direct experience with automation in their day to day job (a lawyer or an author maybe)?

    Maybe you had a prior bad experience with a deluded dev or two, but I think what you said was overly broad - or maybe my anecdotal experience is just the exact opposite of your anecdotal experience. ;-) Have a good one!

  19. hackers worldwide just got a great idea by RhettLivingston · · Score: 1

    How long will it be now before they make bots to submit patch requests that insert vulnerabilities - likely under the guise of fixing a true vulnerability and just making a mistake in doing so?

  20. Self patch by GrahamJ · · Score: 1

    Pointing such a bot at its own codebase is how skynet happens.

  21. What Ethics Committee let this pass? by Pimpy · · Score: 1

    Following up on the links from the article, it's clear that the professor in charge only informed those who accepted the pull requests more than 10 months after the fact, under the guise of "full disclosure" (and that covers only the pull requests he has linked directly; we have no information about the failed ones). While one of the areas they wished to focus on was inherent discrimination against bots, he appears to have missed the point that automated tools, while able to provide simple fixes to obvious problems (predominantly in error-handling flows), have a terrible track record of understanding nuance or interaction subtleties at run-time. Coccinelle went through a lot of this with the Linux kernel as well, and it ultimately received a great deal of acceptance as it matured, although in that case patches were generated automatically and submitted by humans. I wonder then what the objective of this research is: to show that software developers are sceptical of contributions from automated tools that follow semantic rules more than logical reasoning, or that by hiding behind a fake identity representative of a demographic that is more likely to submit trivial changes you are able to hoodwink the developers into accepting your submissions? Both get an ethics fail, and it's not clear that there's anything even novel to show for it.

  22. Improvement over regular code analysis tools? by Dirk+Becher · · Score: 1

    How can this perform better than regular code analysis tools that already inform you of nullptr accesses, unused variables, etc.?