MIT Machine Vision System Figures Out What It's Looking At By Itself (gsmarena.com)

Posted by BeauHD on Monday September 10, 2018 @01:25PM from the all-grown-up dept.

MIT's "Dense Object Nets" or "DON" system uses machine vision to figure out what it's looking at all by itself. "It generates a 'visual roadmap' -- basically, collections of visual data points arranged as coordinates," reports Engadget. "The system will also stitch each of these individual coordinate sets together into a larger coordinate set, the same way your phone can mesh numerous photos together into a single panoramic image. This enables the system to better and more intuitively understand the object's shape and how it works in the context of the environment around it." From the report: [T]he DON system will allow a robot to look at a cup of coffee, properly orient itself to the handle, and realize that the bottom of the mug needs to remain pointing down when the robot picks up the cup to avoid spilling its contents. What's more, the system will allow a robot to pick a specific object out of a pile of similar objects. The system relies on an RGB-D sensor which has a combination RGB-depth camera. Best of all, the system trains itself. There's no need to feed the AI hundreds upon thousands of images of an object to the DON in order to teach it. If you want the system to recognize a brown boot, you simply put the robot in a room with a brown boot for a little while. The system will automatically circle the boot, taking reference photos which it uses to generate the coordinate points, then trains itself based on what it's seen. The entire process takes less than an hour. MIT published a video on YouTube showing how the system works.

1 of 36 comments (clear)

Define "figures it out" by mrwireless · 2018-09-10 14:22 · Score: 4, Insightful

The robot in the video has to be manually shown where to hold the shoe (by the lip). It then understands that it should grab all shoes by the lip.

While it's impressive that image recognition is moving into 3D here, the actual 'figuring it out' step seems to be a matter of definition.

I suspect the robot didn't figure out by itself that a cup should be kept upright either. After all, that would mean that the robot somehow concludes that liquids should not be spilled. That would require a much higher level of cognition.

It's the use of vague words that facilitates the rampant spreading of hype. This inflation of what words mean will harm the AI sector in the long run. It's just like how it is now difficult to get most mainstream people excited about any actual innovation in the 3D printing field - our collective excitement reservoir has been depleted.

The notion that self-driving cars can be classified into 5 levels of self-driving prowess has reached quite a large mainstream audience. Perhaps that concept can be extended to all 'AI'.

1. Handy things. Software that automates things, with a well designed internal ruleset.
2. Smart things. Automation via machine learning, can have a level of unpredictability, but only because common sense cannot predict patterns in big data. Should have been called 'machine learning' and 'algorithms' instead of 'AI'.
3. The next level. I have no idea how we will get here, or what to call it. Understanding Algorithms? Might create really complex classifications of the world around it, and infer things from that. Robots actually figuring out the laws of physics, and copying our value system ("don't spill coffee").
4. The level after that. What to call this now that 'smart' is already taken? 'artificial intelligence' would actually be a good name for this level. Perhaps we will see emergent cognitive phenomena develop out of sheer complexity. I doubt it though.
5. Artificial Consciousness. Systems that have a sense of identity, ethics and 'real' empathy. Perhaps "Sci-Fi AI" is another fun name for this stage.