Slashdot Mirror


Google's Latest Machine Vision Breakthrough

mikejuk writes "Google Research recently released details of a Machine Vision technique which might bring high power visual recognition to simple desktops and even mobile computers. It claims to be able to recognize 100,000 different types of object within a photo in a few minutes — and there isn't a deep neural network mentioned. It is another example of the direct 'engineering' approach to implementing AI catching up with the biologically inspired techniques. This particular advance is based on converting the usual mask-based filters to a simpler ordinal computation and using hashing to avoid having to do the computation most of the time. The result of the change to the basic algorithm is a speed-up of around 20,000 times, which is astounding. The method was tested on 100,000 object detectors using over a million filters on multiple resolution scalings of the target image, which were all computed in less than 20 seconds using nothing but a single, multi-core machine with 20GB of RAM."
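For readers wondering what an "ordinal computation" looks like in practice, here is a minimal Python sketch of a winner-take-all style hash. It assumes the general idea is to record which of a few permuted entries is largest rather than the values themselves. The function names (`wta_hash`, `make_permutations`) and the parameters (`k`, `count`, `seed`) are made up for illustration and are not taken from Google's paper or code.

```python
import random


def make_permutations(dim, count, seed=0):
    """Build `count` fixed random permutations of the indices 0..dim-1."""
    rng = random.Random(seed)
    perms = []
    for _ in range(count):
        p = list(range(dim))
        rng.shuffle(p)
        perms.append(p)
    return perms


def wta_hash(vec, permutations, k):
    """For each permutation, look at the first k permuted entries and
    record the position (0..k-1) of the largest one."""
    code = []
    for perm in permutations:
        window = [vec[i] for i in perm[:k]]
        code.append(max(range(k), key=lambda j: window[j]))
    return code


perms = make_permutations(dim=8, count=16)
x = [0.1, 0.7, 0.3, 0.9, 0.2, 0.5, 0.8, 0.4]

# A positive scale and offset changes every value but not their ordering,
# so the ordinal code stays the same.
scaled = [3.0 * v + 1.0 for v in x]
assert wta_hash(x, perms, k=4) == wta_hash(scaled, perms, k=4)
```

Because the code depends only on the ordering of values, it can serve as a hash-table key, which is roughly how a lookup might stand in for most of the filter arithmetic.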

7 of 113 comments

  1. Porn Collection by sycodon · · Score: 5, Funny

    Can it sort and identify duplicates automagically in my porn collection?

    --
    When Fascism comes to America, it will call itself Anti-Fascism, and tell you to give up your guns.
    1. Re:Porn Collection by Impy the Impiuos Imp · · Score: 5, Funny

      Can it sort and identify duplicates automagically in my porn collection?

      Sure! It sorted your stuff into these categories:

      400-lb. naked guys kissing
      Stuff reported to the NSA
      Someone's drawing of a dragon humping a car
      Taylor Swift

      Over 750,000 pictures in all!

      --
      (-1: Post disagrees with my already-settled worldview) is not a valid mod option.
  2. 20GB?? That's it??? by rosshalz · · Score: 5, Funny

    "... using nothing but a single, multi-core machine with 20GB of RAM" Phew, here I was thinking it'd need some unrealistically high specs from my PC!!

  3. Re:Coming to mobile? by real-modo · · Score: 5, Informative

    Wait, your phone can decode video?!? In real time, playing the movies at normal speed? How many kilograms does it weigh, and how long is the power lead? How big is the mortgage on it? (/socraticmethod)

    The computer innovation process broadly goes like this: first algorithm sort-of works but is incredibly inefficient - tweaks on this - a rethinking of the whole approach that leads to massive speed-ups - further refinement - implementation of the algorithm in hardware, where it becomes just another specialized processor - everybody profits!

    This article is about the third, or possibly fourth, phase of the process. If it works out, phase 5 is straightforward. By itself, phase 5 typically leads to a two orders of magnitude increase in performance, a three orders of magnitude decrease in power consumption, and a two to four orders of magnitude decrease in cost.

    Phases 6 and 7 happen if and when enough people find the provided service useful. (If technologies are no good, that's when only rich people have them. Successful technologies, everyone gets access to eventually.)

  4. Re:Spatial Hashing by Anonymous Coward · · Score: 5, Informative

    Yes, it's a breakthrough. It won the best paper award at this year's Conference on Computer Vision and Pattern Recognition, a tier 1 computer vision conference.

    Hashing invariant properties in images isn't new, but

    banded winner-take-all hashing of histograms-of-oriented-gradient part filters and then using matches across those bands to identify a test feature's nearest neighbors, while simultaneously computing an upper bound or exact dot products of those test features with their nearest learned features, for up to 100,000 objects with small amounts of memory, is new.
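To make the "matches across those bands" part concrete, here is a rough Python sketch of banded lookup. It assumes each stored filter already has a short integer hash code; the filter names, codes, and helper functions (`band_keys`, `build_index`, `candidates`) are invented for illustration and are not from the paper. It covers candidate retrieval only, not the dot-product bounds.

```python
from collections import Counter, defaultdict


def band_keys(code, band_size):
    """Split a hash code into consecutive bands; each (band number, band
    contents) pair is one hash-table key."""
    return [(b, tuple(code[i:i + band_size]))
            for b, i in enumerate(range(0, len(code), band_size))]


def build_index(codes, band_size):
    """Map every band key to the ids of the stored items that contain it."""
    index = defaultdict(list)
    for item_id, code in codes.items():
        for key in band_keys(code, band_size):
            index[key].append(item_id)
    return index


def candidates(index, query_code, band_size, min_matches):
    """Return the stored items sharing at least `min_matches` bands with the query."""
    votes = Counter()
    for key in band_keys(query_code, band_size):
        for item_id in index.get(key, ()):
            votes[item_id] += 1
    return {item_id for item_id, n in votes.items() if n >= min_matches}


# Hypothetical hash codes for three learned part filters.
stored = {
    "filter_a": [0, 1, 2, 3, 0, 1, 2, 3],
    "filter_b": [3, 3, 3, 3, 0, 0, 0, 0],
    "filter_c": [0, 1, 2, 0, 0, 1, 1, 1],
}
index = build_index(stored, band_size=2)

query = [0, 1, 2, 3, 0, 1, 0, 0]
result = candidates(index, query, band_size=2, min_matches=2)
# filter_a shares three bands with the query, filter_c two, filter_b one.
```

Only the surviving candidates would then need an actual dot product, so the cost tracks the number of band matches rather than the total number of filters.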

  5. Re:Coming to mobile? by 91degrees · · Score: 4, Funny

    Phase 7 is profits. You obviously assumed phase 6 was "???".

  6. Re:Captcha's be gone? by Anonymous Coward · · Score: 5, Funny

    So Captchas will become even easier to crack? Great, the sooner we can get rid of them, the better. As it is, they are getting impossible for humans to read, thanks to idiots who don't know how to design them.

    But there's no need to get rid of them if we'll all have a handy browser plugin that can decode them for us at the press of a button!