None of Your Pixelated or Blurred Information Will Stay Safe On The Internet (qz.com)
The University of Texas at Austin and Cornell University are saying blurred or pixelated images are not as safe as they may seem. As machine learning technology improves, the methods used to hide sensitive information become less secure. Quartz reports: Using simple deep learning tools, the three-person team was able to identify obfuscated faces and numbers with alarming accuracy. On an industry standard dataset where humans had 0.19% chance of identifying a face, the algorithm had 71% accuracy (or 83% if allowed to guess five times). The algorithm doesn't produce a deblurred image -- it simply identifies what it sees in the obscured photo, based on information it already knows. The approach works with blurred and pixelated images, as well as P3, a type of JPEG encryption pitched as a secure way to hide information. The attack uses Torch (an open-source deep learning library), Torch templates for neural networks, and standard open-source data. To build the attacks that identified faces in YouTube videos, researchers took publicly available pictures and blurred the faces with YouTube's video tool. They then fed the algorithm both sets of images, so it could learn how to correlate blur patterns to the unobscured faces. When given different images of the same people, the algorithm could determine their identity with 57% accuracy, or 85% when given five chances. The report mentions Max Planck Institute's work on identifying people in blurred Facebook photos. The difference between the two efforts is that UT and Cornell's approach is much simpler, and "shows how weak these privacy methods really are."
For a computer, most algorithms for comparing two pictures already work on what amounts to a blurred version of both. They take samples (pixels) from each picture and check whether the relationships within the two sets of samples are the same, or within some margin of deviation. There is little value in comparing pixel by pixel for exact matches. It is similar to how human fingerprints are matched.
A blurred picture is similar to taking fewer samples from one picture and setting the margin of deviation wider.
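To make the sampling idea concrete, here is a minimal sketch in plain Python. It is not what any real matcher does; the helper names (block_means, similar), the tiny 4x4 "pictures" and the tolerance values are all made up for illustration. Each picture is reduced to block averages (the samples), and two pictures count as a match if every pair of samples is within a margin of deviation.

```python
def block_means(img, block):
    """Average each block x block tile of a grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(0, h, block):
        row = []
        for c in range(0, w, block):
            tile = [img[y][x]
                    for y in range(r, min(r + block, h))
                    for x in range(c, min(c + block, w))]
            row.append(sum(tile) / len(tile))
        out.append(row)
    return out


def similar(img_a, img_b, block=2, tolerance=10.0):
    """True if all corresponding block averages differ by at most tolerance.

    Assumes both images have the same dimensions.
    """
    a, b = block_means(img_a, block), block_means(img_b, block)
    return all(abs(x - y) <= tolerance
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


# Hypothetical 4x4 grayscale pictures, values 0..255.
original = [
    [10, 12, 200, 198],
    [11, 13, 202, 199],
    [90, 92, 50, 52],
    [91, 93, 51, 53],
]
noisy = [[p + 3 for p in row] for row in original]    # same picture, no pixel equal
other = [[255 - p for p in row] for row in original]  # a different picture

same_person = similar(original, noisy)                 # samples agree within margin
different = similar(original, other)                   # samples disagree
# Fewer samples plus a wider margin: the different picture now "matches" too.
blurred_match = similar(original, other, block=4, tolerance=80.0)
```

With the default settings the noisy copy matches and the inverted picture does not, even though an exact pixel-by-pixel comparison would reject both. Dropping to a single sample with a wide margin, which is roughly what a heavy blur does, lets the unrelated picture through as well.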
But for computers, 57% is pretty bad. 85% is also very bad, and that's when you are telling the machine the answer. At those rates, mass comparisons are hard to do: the false positives would be far too numerous for any human to weed through. This applies more to targeted searches, where an investigator wants the 5 most probable matches to a blur. Unlike the researchers here, who know the answer beforehand, the investigator still has to guess which of the five it actually is.
In a criminal investigation with a database of likely suspects, this would work. But we are all about mass collection of data, data, data. With a large population of pictures, the blur will probably match a lot more than 5.
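A back-of-the-envelope sketch of that scaling argument, in Python. The false-match rate below is a number picked purely for illustration; the article reports identification accuracy, not a per-comparison false-match rate, so nothing here comes from the research itself.

```python
def expected_false_matches(population, false_match_rate):
    """Expected number of wrong people whose picture matches a blur by chance.

    Assumes each of the other (population - 1) pictures independently has the
    same hypothetical chance of being mistaken for the blurred face.
    """
    return (population - 1) * false_match_rate


# Hypothetical rate: 1 in 1,000 unrelated pictures looks like a match.
ASSUMED_RATE = 0.001

small_pool = expected_false_matches(50, ASSUMED_RATE)          # short suspect list
large_pool = expected_false_matches(1_000_000, ASSUMED_RATE)   # mass collection
```

Under that assumed rate, a list of 50 suspects yields well under one expected false match, while a million-picture collection yields on the order of a thousand, far more than a five-candidate shortlist can absorb.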
It is a basic result of information theory (the data processing inequality) that processing a dataset cannot increase the amount of information in it. Here the dataset is the combination of the blurred image and the learned statistical averages of a human face.
Once an image has been blurred (information has been deleted), the deleted information cannot be recreated. What you can do is apply statistical averages in the hope of getting something that might resemble the original. It will, however, be just that: cosmetic improvement based on statistical averages.
If sufficient information has been removed by blurring the image, the deblurring process, whether you call it AI or statistical averages, cannot recreate a uniquely identifiable image.
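The "information has been deleted" point can be shown with a toy pixelation, sketched here in Python. The pixelate helper and the 2x2 images are invented for the example, and block averaging is only one simple way to obscure an image, not necessarily what any particular tool does.

```python
def pixelate(img, block):
    """Replace every block x block tile of a grayscale image with its average."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for r in range(0, h, block):
        for c in range(0, w, block):
            cells = [(y, x)
                     for y in range(r, min(r + block, h))
                     for x in range(c, min(c + block, w))]
            mean = sum(img[y][x] for y, x in cells) / len(cells)
            for y, x in cells:
                out[y][x] = mean
    return out


# Three visibly different hypothetical 2x2 images.
checker_a = [[0, 100], [100, 0]]
checker_b = [[100, 0], [0, 100]]
flat_gray = [[50, 50], [50, 50]]

# All three collapse to the same obscured image.
obscured = [pixelate(img, 2) for img in (checker_a, checker_b, flat_gray)]
all_identical = obscured[0] == obscured[1] == obscured[2]
```

Because the mapping is many-to-one, nothing in the pixelated output says which of the three originals it came from. Any "deblurred" result has to be a choice among candidates made from outside knowledge.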
The guy was caught in Thailand. The German police "deswirled" his photograph:
https://en.wikipedia.org/wiki/Christopher_Paul_Neil
"Deep learning" is a configuration of a neural network. Historically we couldn't have nested neural networks because we didn't know how to train them in any reasonable amount of time. Then we figured out how, and discovered nested nets work far better than traditional neural networks.
So you get more specific and descriptive going from: algorithms -> AI -> machine learning -> neural networks -> deep learning.
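As a rough picture of what "nested" means here, the sketch below stacks small layers so that each one feeds the next. It is plain Python with hand-picked weights and covers only the forward pass, not the training the comment refers to. The function names and numbers are illustrative and do not come from Torch or from the research.

```python
def relu(values):
    """Element-wise rectifier: negative activations become zero."""
    return [max(0.0, v) for v in values]


def dense(weights, bias, inputs):
    """One fully connected layer: a weighted sum per output unit, plus bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]


def forward(layers, inputs):
    """Feed the input through each (weights, bias) layer in turn."""
    values = inputs
    for weights, bias in layers:
        values = relu(dense(weights, bias, values))
    return values


# Hypothetical hand-set parameters: a 2-unit hidden layer, then a 1-unit output.
hidden = ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0])
output = ([[1.0, 1.0]], [-1.0])

shallow_result = forward([hidden], [2.0, 1.0])        # one layer only
deep_result = forward([hidden, output], [2.0, 1.0])   # layers nested
```

A "deep" network in this sense is just a longer list of layers. The hard part the comment alludes to is finding good weights for all of them, which this sketch does not attempt.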