Ubisoft is Using AI To Catch Bugs in Games Before Devs Make Them (wired.co.uk)
AI has a new task: helping to keep the bugs out of video games. From a report: At the recent Ubisoft Developer Conference in Montreal, the French gaming company unveiled a new AI assistant for its developers. Dubbed Commit Assistant, the goal of the AI system is to catch bugs before they're ever committed into code, saving developers time and reducing the number of flaws that make it into a game before release. "I think like many good ideas, it's like 'how come we didn't think about that before?'," says Yves Jacquier, who heads up La Forge, Ubisoft's R&D division in Montreal. His department partners with local universities including McGill and Concordia to collaborate on research intended to advance the field of artificial intelligence as a whole, not just within the industry.
La Forge fed Commit Assistant with roughly ten years' worth of code from across Ubisoft's software library, allowing it to learn where mistakes have historically been made, reference any corrections that were applied, and predict when a coder may be about to write a similar bug. "It's all about comparing the lines of code we've created in the past, the bugs that were created in them, and the bugs that were corrected, and finding a way to make links [between them] to provide us with a super-AI for programmers," explains Jacquier.
Just what we need: autocorrect/autoformatting on par with MS Word.
Why not add clippy to your IDE while you're at it?
I remember it being discussed in my software engineering class that trying to automate bug removal or detection could be shown to be isomorphic to solving Turing's halting problem.
File under 'M' for 'Manic ranting'
The more things change, the more they stay the same....
Anybody else remember LINT? I used to work a project that required that all compiler warnings be dealt with and anything reported by LINT was documented and explained IN THE CODE. It certainly didn't catch everything but it sure kept the code consistent and common logical issues from appearing too often.
Now off my lawn....(snicker)
"File to fit, pound to insert, paint to match" - Aircraft Maintenance 101
What does AI think about that, that is the question!
I've used a technique of investigating every error I find during code reviews or debug sessions to determine if the programmer has made the same error pattern in other code. If they have, I add detection of the pattern to a perl-based scanner that runs on code commit. The method has been extremely successful, partly because I am gifted with the use of regular expressions and LL parsers. It is not a method I have been able to teach others how to utilize despite many efforts to do so. No one else on my teams has had the same instincts for automation of pattern detection. Consequently, it has been one of the unique offerings I've been able to bring to the bargaining table for much of my career.
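For anyone wondering what a scanner like that looks like, here is a minimal sketch, in Python rather than Perl. The two patterns and the name `scan_source` are made up for illustration; they are not the actual rules described above, and a real scanner would need a proper parser to avoid false hits inside strings and comments.

```python
import re

# Hypothetical error patterns: each label maps to a regex that flags one
# mistake previously seen in review. Both are illustrative assumptions.
PATTERNS = {
    # "if (x = 5)" where "==" was probably intended
    "assignment in condition": re.compile(
        r"\bif\s*\([^=!<>]*[^=!<>]=[^=][^)]*\)"
    ),
    # "if (ready);" where the stray semicolon ends the statement
    "semicolon after if": re.compile(r"\bif\s*\([^()]*\)\s*;\s*$"),
}


def scan_source(text):
    """Return (line_number, label) for every line matching a known pattern."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings


SAMPLE = """int main() {
    if (x = 5) {
        run();
    }
    if (ready);
    if (y == 3) { go(); }
}
"""

for lineno, label in scan_source(SAMPLE):
    print(f"line {lineno}: {label}")
```

Hooking this into a commit would just mean running `scan_source` over each changed file and rejecting the commit when the list is non-empty.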
It is good that I only have another 15 years or so to retirement. I might just make it.
AI: "Based on years of historical pattern matching, your commit has been flagged as 'needs review' for the following reasons:
1 - First name of developer is 'Fred'"
If the dev AI/Automation is getting that good, why are they not just doing a majority of the game development?
;)
;)
;)
I always hear the horror stories about how bad AI and Automation will affect burger flippers, etc.
Seems to me the real jobs that will be affected are the white-collar elites' jobs. Bankers, Brokers, Developers, Judges, Lawyers, Programmers, etc. Imagine white-collar professions minus the bias/corruption/prejudice because it is all just honest AI & Automation.
Oh yea, honest AI/Automaton politicians, Sweet
Just my 2 cents
Brown ball means bug we can fix after the release, red ball means we can't ship until the precog AI is made happy.
I assume Ubisoft doesn't use OOP much?
It's coming soon - "Computer - write me a *Nix kernel."
The days of the software developer will be gone someday.
I mean, how many of us still program computers with patch cords?
Maybe this AI can finally convince them to give an option to NOT have the elevation angle of driving cameras try to reset to default after a couple of seconds? Drives me nuts in all their games, e.g. Watchdogs and Ghost Recon: Wildlands. Especially important if driving a taller vehicle.
Says it can detect 6 out of 10 bugs? Based on what? How would it even know what a "bug" was in the context of the game? The video also says it creates code signatures for bugs, but doesn't explain how. They explain the concept, but does it actually work? What specifically does it do? Without seeing examples, it's hard to imagine this tool does what they seem to be saying it does.
this is ..... not like LINT. Remember LINT? LINT is still in widespread use. This would be a complementary tool to catch higher level bugs based on code heuristics in valid correct code beyond the safety checks that LINT looks for. This is not for doing what LINT does, and what LINT does is still very useful.
"Old man yells at systemd"
Auto QA testing can just fail silently, or pass in a way that any real person would see as an error.
Also, poor UIs / control setups are bugs that an automated system cannot find but QA testers will.
I'm not a game developer, but from my understanding a game "dev" does very little programming in the traditional sense. Most games use pre-made engines such as unreal engine, unity, cryengine, etc. These guys are mostly doing scripting, world building, animation, modeling, progress triggers, hit boxes, etc. So assuming I'm correct on that, Lint would be useless for this. In that case, they've created Lint for another problem space.
Yes, indeed. We realized a lot of those compiler warnings were actually trying to tell us something (WOW!) and cleaned up our code base over the course of a number of years. We are now pretty much at the point where compiler warnings are generally viewed as errors, so the bar is very high and requires additional code review if you legitimately need to submit something that triggers a warning ... "the compiler is lying, I'm right" isn't good enough. We have a number of people on staff who are really good at figuring this out before it ships and shows up as a bug. It's also far more reassuring when you see your code compile really cleanly, makes a lot of other stuff far easier.
It's also worth noting that there's plenty of stuff you can do when you are checking in code if your organization has come up with code guidelines. Checking for things like the presence of tabs, copyright strings and a number of other things can be enforced.
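A check-in guideline check of that sort can be tiny. The sketch below is only an illustration: the function name `check_guidelines`, the "Copyright" marker, and the rule that the notice must sit in the first five lines are all assumptions, not anyone's actual house rules.

```python
def check_guidelines(text, copyright_marker="Copyright", header_lines=5):
    """Return a list of guideline violations for one file's contents."""
    lines = text.splitlines()
    problems = []

    # Assumed rule 1: no tab characters anywhere in the file.
    for lineno, line in enumerate(lines, start=1):
        if "\t" in line:
            problems.append(f"line {lineno}: tab character")

    # Assumed rule 2: a copyright notice must appear near the top.
    header = "\n".join(lines[:header_lines])
    if copyright_marker not in header:
        problems.append("missing copyright notice in file header")

    return problems


GOOD_HEADER_BAD_TAB = "// Copyright 2018 Example Corp\nint x;\n\tint y;\n"
print(check_guidelines(GOOD_HEADER_BAD_TAB))
```

A commit hook would run this over each changed file and refuse the check-in if any list comes back non-empty.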
But it is the same concept as LINT.... It may be different from LINT but this really isn't a new idea, and it IS LIKE LINT...
There have been static code analysis tools in use for decades now. LINT was among the first of these tools to find widespread use, and many have followed in its footsteps. This is NOT a new idea, even if the implementation method varies from the C program the initial LINT was/is.
Read the Mythical Man Month.... There is nothing new... Each generation thinks their stuff is better, that they found the way, the silver bullet. I've watched a couple of generations of new programmers go through the same "Eureka! I've got it!" claims, only to discover the hard way what I discovered: the devil is in the details and there is no magic bullet.
How am I supposed to write myself a new minivan now?!
Anons need not reply. Questions end with a question mark.
Developers' abilities to create new and exciting bugs will continue to dominate until AI learns to code by itself.
Have gnu, will travel.
Just like their game trailers, this will be a highly polished original idea that looks like a game changer. Sadly, when it's released, reality will set in.
It is so "eightyish".
LINT is something that looks at your code for problems and reports them.
A compiler is something that looks at your code for problems and reports them (and also generates output).
This ubisoft tool is something that looks at your code for problems and reports them.
Your colleague standing over your shoulder is something that looks at your code for problems and reports them.
Your colleague who does code-review is something that looks at your code for problems and reports them.
Your copy-editor is someone who looks at what you've written for problems and reports them.
Your spell-checker is something that looks at what you've written for problems and reports them.
Your grammar-checker is something that looks at what you've written for problems and reports them.
It's kind of silly to say "this is LINT!" or "there is nothing new" as regards this story. The task of having feedback on what you've produced is at least as old as humankind itself and has existed in every area of human endeavor, in a huge variety of ways, with feedback performed at a huge variety of times. Of course that aspect isn't new. Beyond that, this Ubisoft tool is as (un-)related to LINT as every single other tool I listed.
Still no cure for Ubi's stupid design choices.
semantics are everything!
See, AI is replacing what we (consumers) used to do, especially for companies like Ubisoft. What do they expect me to do now, play the game without issue and enjoy myself? What am I going to complain about now?
Amazing!!! You mean those warnings actually mean something sometimes? And that a lot of the bugs we make were actually known about years or even decades before we made them?!?! I'm laughing with you.
I've known a lot of "senior-level developers" who don't even seem to think the warnings mean anything and they certainly don't believe in code guidelines. Funny stuff.
if they find more problems in the "hard" code or the "easy" bits.
I have a hypothesis that the easy sections cause more problems for a couple of reasons:
1. Inexperienced developers tend to get the less demanding code.
2. Nobody takes easy jobs seriously, so they are more likely to assume something conceptually simple is correct.
3. Obviously "hard" code gets more rigorously tested.
Obviously I don't know this is the case but I would like to. For me the stats from the teaching database are more interesting than the machine learning system they fed it to.
delete this
"I'm sorry. I can't do that, Dave."
Yea, I used to say that kind of thing when I was young too. Hubris lives on in the young. We had all the good ideas then too, we were better educated, fresh out of school and full of promise. But we were as stupid as those who came before us. Wisdom is hard won through experience and I've personally learned the grey beards of my day were right, there is really nothing new. Programming remains the same problem, though the names and faces have changed.
Face it.. At the very best, this is just an extension of LINT and just a static code analyzer like LINT was/is. They've been doing this kind of thing long before I came into this field fresh out of school and when you are old and grey, the newbies will "invent" the same thing AGAIN, and you can dig up this post and quote it.
You should read the article.
It has absolutely nothing to do with LINT and is not the same concept.
Cost free eBook I read (by iBook/Kobo/Amazon/ObookO/Gutenberg etc.): "The Green Odyssey" by Philip Jose Farmer.
No, it is not a static code analyzer.
You seem too old to grasp new concepts.
Why don't you read the article instead of continuing to make an idiot out of yourself?
We are now pretty much at the point where compiler warnings are generally viewed as errors, so the bar is very high and requires additional code review if you legitimately need to submit something that triggers a warning ... "the compiler is lying, I'm right" isn't good enough. We have a number of people on staff who are really good at figuring this out before it ships and shows up as a bug.
Highest warning levels in most compilers are reserved for speculative belchings about things the compiler has no means of understanding. This output is generally worth little more than occasional browsing.
You would do a lot better with static analysis software than wasting time "process whoring" high level compiler warnings.
It's also worth noting that there's plenty of stuff you can do when you are checking in code if your organization has come up with code guidelines.
Checking for things like the presence of tabs, copyright strings and a number of other things can be enforced.
Sounds like your typical obsessive compulsive pedant who cares more about scratching itches than producing shit customers can use and benefit from.
This sounds more like Clippy.
"I see you are trying to write a state machine..."
const int one = 65536; (Silvermoon, Texture.cs)
SJW, n: "Someone I don't like, and by the way I'm a fuckwit" - AC
For any given program f running on Microsoft Windows it will halt if you let it run long enough.
Hi! I make Firefox Plug-ins. Check 'em out @ https://addons.mozilla.org/en-US/firefox/addon/youtube-mp3-podcaster/
Sounds like Ubisoft are slowly catching up to mainstream development practises. Even MS has for years been using historical bugs to feed into code analysis tools to find and stop errors, hence why you tend to get big spurts of patches from them: as new types of bugs are discovered, they get fixed and then fed into the front end to try and prevent them from happening again.
Do people really not use -Wall -Werror (or equivalent) by default? Aside from stuff being put on github which apparently has a requirement to spew megabytes of warnings when compiling...
Not a bad thing. That's actually quite useful. State machines and dates and other things seem to be items that programmers always re-invent, despite there being a half dozen of them in the libraries they already use.
Imagine the usefulness one could have if the IDE simply stated "it appears you're implementing a date function. Have you considered the date and time APIs already available to you? Here's some APIs and documentation."
This is especially true for languages with rich APIs that do everything, yet everyone seems to reimplement them wrongly anyways.
Want to impress me further? See to it that it sees I implement bubblesort, but it stays quiet if it knows the data set I'm working with is small. If I make it bigger, it then flags it. (Bubblesort is perfectly fine as a sort algorithm for small datasets - like say, 10 elements)
A few years from now, this AI system can link certain kinds of coding errors for a particular developer to his or her arrival time at the workplace and their selection of clothing from the security camera feeds.
Good to have Auto QA though to free humans for what they're good at. Also good to have auto-play-testing with AI agents just running against walls etc. If any AIs get stuck, or have position.z < -100 and have fallen out of the game level, then you can replay their game route.
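Both of those checks are easy to sketch. Everything named here is hypothetical: the `Sample` record, the function names, and the "stuck" thresholds are assumptions for illustration, and only the -100 cutoff comes from the comment above.

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    """One recorded point on an agent's route (hypothetical log format)."""
    t: float
    x: float
    y: float
    z: float


def find_fall_through(route, kill_z=-100.0):
    """Return the index of the first sample below kill_z, or None."""
    for i, sample in enumerate(route):
        if sample.z < kill_z:
            return i
    return None


def is_stuck(route, window=5, min_move=0.5):
    """True if the agent moved less than min_move over the last `window` samples."""
    if len(route) < window:
        return False
    a, b = route[-window], route[-1]
    return math.dist((a.x, a.y, a.z), (b.x, b.y, b.z)) < min_move


falling = [Sample(0, 0, 0, 1), Sample(1, 0, 0, -50), Sample(2, 0, 0, -150)]
idle = [Sample(t, 3.0, 4.0, 0.0) for t in range(5)]

print(find_fall_through(falling))
print(is_stuck(idle))
```

The index from `find_fall_through` tells you where to stop the replay, so a human only has to watch the last few seconds before the agent left the level.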
Save your play testers for quality not random breaking of the game.
What happens when the AI identifies Ubisoft as the bug?
I believe that in the movie, they started using AI to lint their code, then it became self aware and just started writing the code and finally gave rise to the terminators which put all humans into a simulation to harvest their bioenergy. Or something like that...
Let me try again with a car analogy. Someone designs and builds a Tesla. You come along and say "Face it. At the very best, this is just an extension of the '87 Ford F150. There's nothing new. They've just re-invented the '87 Ford F150. Oh the hubris of the young, to think that they could do anything different."
What's weird is that your "prior art" is (1) oddly specific almost in ignorance of the rest of the field that came before and after, (2) misses the point that Tesla has taken one existing technology "battery" and put it into another existing technology "electric vehicle" in a well-designed package that works well and is interesting for these reasons.
Oh, grasshopper... This is not new. It's doing static code analysis and flags parts that may have issues based on where trouble has been seen in the past. This is EXACTLY what LINT is, advice from experience. It's basically saying "Um, you *might* not want to do this kind of thing because it's often a mistake. Are you sure?"
Of course you somehow think that because it's some fuzzy AI technique used it's somehow different? Cute....If anything, it's less effective being AI, but that's another debate you won't understand.
Isn't that what they want? No game but all money?
Ok, idiot.
It does not do static code analysis.
You still have not read the story or the linked article.
I like your analogy, but consider this small addition...
Let's say I tell you that we discovered a whole pile of rules for designing cars. Don't try to stamp metal into this kind of shape, mount your glass in ways the flexing of the body doesn't break it, don't use plastic for this kind of part or that, don't use 6V lamps or non-rechargeable batteries, don't put the gas tank too close to the bumper or put electrical wires and exhaust manifolds near it either.... We use these "rules of the road" to validate our design changes as we make them.
You invented an AI way to look for this same list in your vehicle designs....
The concept of that Tesla isn't new, a battery operated car you can drive on the roads.. Only the implementation is new. Tesla didn't invent the electric car, they just implemented one.
This 'new' program was invented back in the days of LINT, this is just a new implementation of one. It's not even better than LINT if you ask me, but being AI based, there's no way to prove that objectively.
Ok, idiot. It does not do static code analysis.
From the above article I shall quote:
"It's all about comparing the lines of code we've created in the past"
Um... "Static code analysis" is looking at the source code for probable errors. This program does that.
You may have read the articles in question, but you obviously don't understand what you saw there.
No, it is not a static code analyzer. You seem too old to grasp new concepts. Why don't you read the article instead of continuing to make an idiot out of yourself?
From the Article above:
"It's all about comparing the lines of code we've created in the past"
That sure sounds like "static code analysis" to me. Perhaps you don't understand what that term means?
Static Analysis of source code is looking at the source code for interesting patterns, in most cases looking for common programming errors. LINT did this, this "new" program does the same thing. It looks at the source code for patterns right? Then it's doing static analysis.
I doubt that's how it'll work out. It isn't 'AI', and though tendencies can be noted, the future isn't so linear that it can be predicted by anything, let alone a glorified algorithm; everything can turn on a dime. Meh.
From the Article above:
"It's all about comparing the lines of code we've created in the past"
That sure sounds like "static code analysis" to me.
It IS static code analysis. But that doesn't mean it is "like Lint".
Lint uses a table of HUMAN GENERATED patterns. These patterns are labor intensive to produce, and only find bugs that humans thought to check for.
This new checker looks at a steadily expanding database of bugs, and the fixes for those bugs, and LEARNS THE PATTERNS ON ITS OWN. This means it can have a much bigger set of patterns, including many that a human might have never thought to include. It also means that the system can steadily improve.
Most likely this system will be used as a supplement for Lint, rather than a replacement. But as it learns and improves, it may make Lint no longer very useful.
Kudos for implementing quality control measures. But you cannot overcome the following poor "design" choices with this tool:
- An insistence on shipping a flagship game before the holiday season to meet financial numbers, no matter if it's ready or not (see Assassin's Creed Unity)
- An initial upfront gimmick followed by horribly designed gameplay/missions when compared to your competitors (see Watch Dogs).
- Always on DRM requirements of your online platform (see uPlay)
etc.
Yes, but it also means that it will find patterns like, "Josh always does this, Josh writes lots of bugs, this might be a bug." I mean, maybe just fire Josh?
Want to impress me further? See to it that it sees I implement bubblesort, but it stays quiet if it knows the data set I'm working with is small. If I make it bigger, it then flags it. (Bubblesort is perfectly fine as a sort algorithm for small datasets - like say, 10 elements)
So, there's this thing called OO programming...
I mean, maybe just fire Josh?
Staffing decisions are outside the scope of the tool's remit.
... and how does it compare to existing static code analyzers? It's not that static code analysis is a completely new technique.
There has been plenty of long-term research on this topic under the name "defect prediction". It is actually one of the best (or few..) suited application areas in the software engineering processes for machine learning. You get an obvious set of training data. Record all the bugs found, issues reported, match them to fixing patches (or whatever you call it, fix locations to code in practice). Label those places/parts of code as "error-prone" before the fix. Maybe take the "after" part as an example of non-issue. Train your machine learning classifier.
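To make that recipe concrete, here is a toy sketch of the label-and-train step. It is not what Ubisoft or any research group actually built; it is a bare multinomial naive Bayes over code tokens, and the class name `TinyDefectModel`, the tokenizer, and the four training lines are all invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(line):
    """Split a code line into identifiers, numbers, comparison ops, and symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[<>=!]=|\S", line)


class TinyDefectModel:
    """Naive Bayes with add-one smoothing over 'buggy' and 'clean' lines."""

    def __init__(self):
        self.counts = {"buggy": Counter(), "clean": Counter()}
        self.examples = Counter()

    def train(self, line, label):
        self.counts[label].update(tokenize(line))
        self.examples[label] += 1

    def score(self, line):
        """Log-odds that the line resembles past bugs (needs both labels trained)."""
        vocab = set(self.counts["buggy"]) | set(self.counts["clean"])
        totals = {k: sum(c.values()) for k, c in self.counts.items()}
        log_odds = math.log(self.examples["buggy"] / self.examples["clean"])
        for tok in tokenize(line):
            p_buggy = (self.counts["buggy"][tok] + 1) / (totals["buggy"] + len(vocab))
            p_clean = (self.counts["clean"][tok] + 1) / (totals["clean"] + len(vocab))
            log_odds += math.log(p_buggy / p_clean)
        return log_odds


# Invented history: the line before each fix is "buggy", the line after is "clean".
model = TinyDefectModel()
model.train("for (i = 0; i <= n; i++)", "buggy")
model.train("for (i = 0; i < n; i++)", "clean")
model.train("while (j <= len)", "buggy")
model.train("while (j < len)", "clean")

suspect = model.score("for (k = 0; k <= size; k++)")
fine = model.score("for (k = 0; k < size; k++)")
print(suspect > 0, fine < 0)
```

With only four training lines the model has simply memorized that `<=` showed up in the pre-fix versions, which is also a fair picture of how such a classifier can go wrong: it will flag every legitimate `<=` until it sees enough counterexamples.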
It's been done for a long time. Much like most semi-academic application areas, this is again done with university-company collaboration. Because no company on its own ever wants to get into this type of stuff due to seeing little real benefit beyond your basic, easily available and applicable, analyzers, and writing those (unit) tests that you should write anyway. Similar to topics such as "regression test optimization". Decades of slapping on it, minor improvements if any, but with creative reporting, always some Uni finding a "partner" to get funded and help them put the academics to a little bit of practice, as it otherwise is not so profitable.
So here they found a better known company (Ubisoft) and decided to call machine learning AI, brilliant hype generated. Now let's have a project meeting and report to our funding agency about how we made world-wide news with our super-AI tech and globally big companies. OMG have many more millions you so good. Because the funding agency will not understand jack about what is going on there, and if they do, they will still be impressed someone made something working in some way. Which sadly is a true achievement in that context. ...
"catch bugs before they're ever committed into code"
What?
WHAT?
They catch the bugs in the programmers' heads before they fucking type it?
Compiler warnings are often viewed as "noise" and disregarded entirely, or "logged and fixed in the next release". We learned our lesson the hard way years ago and have moved to warning-free code as a checkin requirement, but it would not surprise me to find a lot of organizations with date-driven releases who let them slip, especially as the ship date gets ominously close. We pretty much use -Wall -Werror, but occasionally I need to deal with stuff from github and it's warning city. It's kind of like learning hygiene is really good and then having to go somewhere soap hasn't been invented.
It is not _code analysis_, hence it can not be _static code analysis_
But perhaps you only want to nitpick with words.
The tool knows nothing about the code or the programming language.
It only knows I fixed a bug in line 100 of a text file.
And bottom line you can use the same technique for editing books. "The lector says 'this phrase is bad', the author changes it to 'this sounds better'". The AI checks for similarities in the rest of the book and flags them.
In other words: it knows nothing about the topic it is looking at.
LINT knows a lot about the language it is linting, after all it uses the same parser and AST as the compiler.
Historically you had the compiler and lint because machines did not have enough memory for in-depth semantic analysis AND code generation in one program. So they wrote two programs, which share most of their code. One does an in-depth analysis of the code and outputs warnings (warnings that often _are_ errors). The compiler only wants to compile and emit code. It does not care about many things. Which is also considerably faster on classic machines (machines with a few KB of RAM, e.g. PDP-11 and Apple ][, yes, I programmed in ancient C on Apple ][s - I believe Aztec C)
So LINT and Ubisoft's approach have absolutely nothing in common.
If this tool doesn't understand the programming language of the code, Then... It's pretty much worthless compared to LINT. Shall we send all our code to a group of English grammar experts for their comments before we allow it to be committed too? What's the point of that?
In any programming language there are commonly used sequences which are prone to errors, which are syntactically correct, but likely wrong. Your AI based search would produce lots of useless garbage if it cannot tell the difference between a comment and actual code.
I'm going to say it only one more time. Static code analysis (looking at the source code) is what LINT did and does. Your brand "new" AI program, does the same thing. If it works as you claim, it likely also produces a lot of false positives OR if the thresholds are set high, misses a lot of common mistakes and you'd be better off tossing the AI part out and just hard code it, like LINT did....
This is called static analysis and has been around for YEARS. Tons of companies are doing this.
GCS/MU/P d- s:- a-- C++++$ UL++ P+ L++ E+ W++ N o K- w--- O M+ V- PS+++ PE Y+ PGP t+ 5- X R++ tv+ b++ DI++ D++ G+ e++ h-
I'd call the made-for-VR movie, "Do Androids evolve to become veterinarians specializing in sheep?"
"There is no god but allah" - well, they got it half right.
If you think it does the same thing then dream on ...
from my understanding a game "dev" does very little programming in the traditional sense
I work on a VR project at my day job (basically a video game). Your statement that we use a "pre-made" game engine is basically like saying that nobody does any "real" programming if they use a library or some code that someone else wrote. In other words: yours is a no-true-scotsman statement.
These guys are mostly doing scripting
We used both C++ and Unreal Script when we were using UDK (Unreal Engine 3). But everything we do now with Unreal 4 is C++ (and some Blueprint stuff to hook the GUI together).
world building, animation, modeling, progress triggers, hit boxes, etc.
Yes, that's the problem domain. And how do you think these things get implemented? With code.
So assuming I'm correct on that, Lint would be useless for this.
You're not really correct on all that. Also Lint and other static analysis tools are useful. John Carmack has some great insight into all this and how static analysis tools can help and fit into game development. As evidence: the Doom 3 source code is pretty generally very good.
But, didn't we once address this issue by doing unit testing and even limited end-to-end testing, often automated, before checking code in? Didn't we also publicly shame those who introduced bugs into source control, and fire people who regularly broke the build?
Seems like using AI is an overly-complicated, likely-fragile solution to a very simple problem.
You know what the difference between engineering and science is? A scientist will spend $10MM to develop a pen that will write in space. An engineer, a good engineer, will buy a 20 cent pencil to achieve the same goal. (No, I am not a Russian troll. Just think that story is a great example of ego vs. outcome based decision making)
That's one of the reasons I left the tech industry. It's a science culture, not an engineering one. You all are more interested in proving what gigantic brains you swing than in achieving outcomes.