Machine Figures Out Rubik's Cube Without Human Assistance (technologyreview.com)

Posted by BeauHD on Sunday June 17, 2018 @02:14AM from the hands-free dept.

An anonymous reader quotes a report from MIT Technology Review: [Stephen McAleer and colleagues from the University of California, Irvine] have pioneered a new kind of deep-learning technique, called "autodidactic iteration," that can teach itself to solve a Rubik's Cube with no human assistance. The trick that McAleer and co have mastered is to find a way for the machine to create its own system of rewards. Here's how it works. Given an unsolved cube, the machine must decide whether a specific move is an improvement on the existing configuration. To do this, it must be able to evaluate the move. Autodidactic iteration does this by starting with the finished cube and working backwards to find a configuration that is similar to the proposed move. This process is not perfect, but deep learning helps the system figure out which moves are generally better than others. Having been trained, the network then uses a standard search tree to hunt for suggested moves for each configuration.

The result is an algorithm that performs remarkably well. "Our algorithm is able to solve 100% of randomly scrambled cubes while achieving a median solve length of 30 moves -- less than or equal to solvers that employ human domain knowledge," say McAleer and co. That's interesting because it has implications for a variety of other tasks that deep learning has struggled with, including puzzles like Sokoban, games like Montezuma's Revenge, and problems like prime number factorization. The paper on the algorithm -- called DeepCube -- is available on Arxiv.

43 of 86 comments

Interesting name for a game by Anonymous Coward · 2018-06-17 02:20 · Score: 1

:D :D :D :D
Traveler's diarrhea - Wikipedia
https://en.wikipedia.org/wiki/Traveler%27s_diarrhea
"Montezuma's revenge (var. Moctezuma's revenge) is a colloquial term for traveler's diarrhea contracted in Mexico."
Wow amazing! by 110010001000 · 2018-06-17 02:21 · Score: 1, Funny

This is great. Now that we have mastered Chess, Go, and Rubik's Cube all of these "researchers" will put them to work solving meaningful problems. Because AI. Right?
1. Re:Wow amazing! by TFlan91 · 2018-06-17 02:31 · Score: 4, Insightful
  
  Games are easy for "AI" because games have strict rules that a modeler can account for/predict.
2. Re:Wow amazing! by religionofpeas · 2018-06-17 04:04 · Score: 3, Insightful
  
  Games are easy for "AI" because games have strict rules
  Just because the rules are strict (or even simple) does not mean that the game is easy. You can achieve arbitrary complexity by iterating the rules a large number of times. For example, the rules of Go are strict, the question whether a given board position is winning for white is hard. The rules of a programming language are strict. Writing a Linux kernel is hard. The rules of math are strict. Providing a proof for Fermat's last theorem is hard. The rules of physics and soccer are strict. Making a robot that can beat a human at the game is hard.
3. Re:Wow amazing! by gweihir · 2018-06-17 05:40 · Score: 1
  
  Hehehehe, nice. No, not actually AI, just a planning algorithm as being used and researched for something like > 50 years now. This is another instance of machines getting faster, not of them getting any smarter. On the face of it, the Rubik's cube is a very simple problem with a very simple description and a low number of states. Sure, the number of states is actually pretty large when seen absolutely, but for a planning problem, it is not that large and, in particular, the score for a state is downright simplistic: The number of moves to it being solved.
  Real-world planning problems are nowhere near that simple. Hence while we will continue to see these stunts, we will not see any real-world problems solved this way for a long, long time, if ever. Also take into account that single-core CPU speed scaling is dead and that most planning algorithms are, in the end, strongly constrained by single-core speeds.
  
  --
  Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
4. Re:Wow amazing! by gweihir · 2018-06-17 05:41 · Score: 1
  
  Exactly.
  
  --
  Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
5. Re:Wow amazing! by sfcat · 2018-06-17 10:54 · Score: 1
  
  Hehehehe, nice. No, not actually AI, just a planning algorithm as being used and researched for something like > 50 years now. This is another instance of machines getting faster, not of them getting any smarter. On the face of it, the Rubik's cube is a very simple problem with a very simple description and a low number of states. Sure, the number of states is actually pretty large when seen absolutely, but for a planning problem, it is not that large and, in particular, the score for a state is downright simplistic: The number of moves to it being solved.
  Real-world planning problems are nowhere near that simple. Hence while we will continue to see these stunts, we will not see any real-world problems solved this way for a long, long time, if ever. Also take into account that single-core CPU speed scaling is dead and that most planning algorithms are, in the end, strongly constrained by single-core speeds.
  That's not all that off base but it's not quite correct either. You are just missing some context. So this appears to be using a part of AI called Reinforcement Learning (I did my thesis on RL at CMU 20 years ago). So in RL, there are a bunch of techniques for taking a pre-defined domain with a reward function that shapes the behavior learned by the system. There are plenty of algorithms (Value Iteration, Policy Iteration, Q-Learning, etc) for solving a problem with an existing reward function and a discrete set of states (and actions).
  But now we want to learn the reward function somehow. This is where the current research is focused. To that end, there are techniques to try to learn this discrete reward function that best supports the learning process (Double Q Learning, Actor-Critic, and others). This research in the article seems to be another technique to learn these reward functions but in this case, there was still an existing model of the game (the states of the cube). But in this case, it seems they just reinvented Value Iteration (which is about 30 years old), slapped a new name on it, but didn't seem to link up their work with that of other RL researchers. So just a big yawn...
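For readers who have not met Value Iteration, here is a minimal sketch on a made-up toy problem. It is not the paper's method: the five-state chain, the cost of -1 per move, and the names `step` and `value_iteration` are all assumptions chosen for illustration. With that cost, the value of a state ends up being minus the number of moves needed to reach the goal.

```python
# Toy value iteration on a hypothetical 5-state chain; state 4 is the goal.
N_STATES = 5
GOAL = 4
ACTIONS = (-1, +1)  # move left or right along the chain


def step(state, action):
    """Deterministic transition: move along the chain, clamped to its ends."""
    return min(max(state + action, 0), N_STATES - 1)


def value_iteration(gamma=1.0, sweeps=50):
    """Repeatedly back up V(s) = max over actions of [-1 + gamma * V(next)]."""
    values = [0.0] * N_STATES
    for _ in range(sweeps):
        for s in range(N_STATES):
            if s == GOAL:
                continue  # the goal is terminal and keeps value 0
            values[s] = max(-1.0 + gamma * values[step(s, a)] for a in ACTIONS)
    return values


values = value_iteration()
print(values)
```

A cube solver cannot sweep over every state like this, which is presumably why a learned approximation of the value function is needed at all.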
  
  --
  "Those that start by burning books, will end by burning men."
6. Re:Wow amazing! by gl4ss · 2018-06-17 19:49 · Score: 1
  
  it's still far from AI though.
  making a robot that runs around after a ball and would fulfill the rules of soccer is indeed hard.
  also in this case, the "AI" was shown the wanted end result. it's all very neat except the headline says it solved it on its own, which would have been a feat if it figured out semi randomly on its own that this is how it's supposed to be presented for someone to be impressed. after that it's just trial and error loop so remind me again how is this ai?
  if you had _Actual_ frigging ai teaching it to play soccer would be fairly easy, but first you would need to ask it to make a robotic vessel for it to play with anyways.
  AND SERIOUSLY SOCCER IS THE MOST LEAST STRICT FUCKING PIECE OF GARBAGE ON HIGH LEVEL YOU CAN FUCKING IMAGINE. IT DOESN'T EVEN MATTER IF PEOPLE CAN SEE YOU CHEATING. so what would probably happen is that it would just straight up kill the opposing teams players when the ref isn't looking, because it's fair game if the ref doesn't catch it then and there.
  
  --
  world was created 5 seconds before this post as it is.
7. Re:Wow amazing! by religionofpeas · 2018-06-18 00:10 · Score: 1
  
  after that it's just trial and error loop so remind me again how is this ai?
  It's not just trial and error. The Rubik's cube has about 4.3 × 10^19 combinations, and most of them look like fully scrambled cubes. You cannot randomly try things until you stumble on the solution. The AI part is where it learns the patterns that tell it that it's making progress. Most humans who have come up with a solution to the Rubik's cube start by first solving one side, and then the second layer, and then the top. This AI system has done something similar, except it doesn't work in layers, but came up with its own intermediate stages.
With one exception; the goal state by Anonymous Coward · 2018-06-17 02:28 · Score: 2, Insightful

Someone had to tell it what is a solution. If you give it a solved cube, that's assistance. Is it really that hard not to inflate headlines?
1. Re:With one exception; the goal state by PPH · 2018-06-17 04:19 · Score: 4, Funny
  
  If you give it a solved cube
  And you give it a scrambled cube. The AI shouts "Hey look! Halley's comet!" And while you are looking up, it switches them.
  Turing test: Passed.
  
  --
  Have gnu, will travel.
2. Re:With one exception; the goal state by Hognoxious · 2018-06-17 04:40 · Score: 1
  
  I did wonder how it decided for itself that the completed cube was somehow "better".
  
  --
  Confucius say, "Find worm in apple - bad. Find half a worm - worse."
Odd definition of "without human help" by Entrope · 2018-06-17 02:44 · Score: 5, Insightful

This algorithm was able to figure out how to solve Rubik's Cube with no help from humans other than humans providing the (simulated) cubes, describing what the solution looks like, and designing an algorithm specific to solving Rubik's Cube?
Color me less than impressed.
1. Re:Odd definition of "without human help" by DontBeAMoran · 2018-06-17 03:15 · Score: 1
  
  Color me less than impressed.
  What do you mean? White, red, blue, yellow, orange or green?
  
  --
  #DeleteFacebook
2. Re:Odd definition of "without human help" by gweihir · 2018-06-17 05:42 · Score: 2
  
  Oh, it still is a nice result. But you are describing exactly the core problem with it: Everything was clear and described in simple, clear statements from the start. That is not how a real-world problem presents itself.
  
  --
  Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
3. Re:Odd definition of "without human help" by religionofpeas · 2018-06-17 07:27 · Score: 1
  
  Everything was clear and described in simple, clear statements from the start. That is not how a real-world problem presents itself.
  That's exactly how the problem is described on the box when you buy a Rubik's cube.
4. Re:Odd definition of "without human help" by gweihir · 2018-06-17 07:53 · Score: 1
  
  So?
  
  --
  Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
5. Re:Odd definition of "without human help" by parkinglot777 · 2018-06-18 02:23 · Score: 1
  
  If you are talking about an algorithm to get from the start to the end, it could be interesting. However, what I don't find useful is that a Rubik's solution is simply a pattern leading from one state to another. Thus, you need to run the algorithm only once to create links from each pattern to the solution state. Even though there are 6 faces, it is still possible to store them all. Why? Because that's what those Rubik geniuses do: memorize certain patterns from one to the other. A computer should be able to do it better than a human.
GOAP? by Anonymous Coward · 2018-06-17 02:50 · Score: 1

Sounds like GOAP - goal orientated action planning. You start at the end state then perform actions (in reverse) until you get to the current state. I read the article which doesn't tell you much more about how they did it. It sounds like they brute forced a bunch of moves to build up a tree then used A* on the tree. Then they trained a neural net on the brute-forced solutions. They talk about evaluating how close a cube state is to the goal state, but they don't explain how the AI determined that. It sounds like they hard coded what closeness means, so they're lying when they say the AI worked without human assistance. Without human assistance means they would have needed to use a GA, self-playing, or some other learning method to determine what closeness means. The article's "without human assistance" refers to them not hard-coding any move sequences. I consider that a very far stretch of the word. I'd call hard-coding moves as cheating. If you give it everything it needs, it turns an AI into an algorithm in my mind.
Here's a link to the paper from the article. I don't have the time to read it right now: https://arxiv.org/pdf/1805.07470.pdf
1. Re:GOAP? by sfcat · 2018-06-17 11:09 · Score: 1
  
  Sounds like GOAP - goal orientated action planning. You start at the end state then perform actions (in reverse) until you get to the current state. I read the article which doesn't tell you much more about how they did it. It sounds like they brute forced a bunch of moves to build up a tree then used A* on the tree. Then they trained a neural net on the brute-forced solutions. They talk about evaluating how close a cube state is to the goal state, but they don't explain how the AI determined that. It sounds like they hard coded what closeness means, so they're lying when they say the AI worked without human assistance. Without human assistance means they would have needed to use a GA, self-playing, or some other learning method to determine what closeness means. The article's "without human assistance" refers to them not hard-coding any move sequences. I consider that a very far stretch of the word. I'd call hard-coding moves as cheating. If you give it everything it needs, it turns an AI into an algorithm in my mind.
  Here's a link to the paper from the article. I don't have the time to read it right now: https://arxiv.org/pdf/1805.074...
  Search is the basic AI problem and thus part of AI. And no, they are not using GOAP here (which is just a type of search). It looks a lot more like Value Iteration which is a type of RL (Reinforcement Learning). The point of this research is that it learns to not blindly try random things but learns to search in a more intelligent fashion. But unfortunately for them, their technique was invented decades ago and so just slapping a new name on it doesn't really impress anyone.
  
  --
  "Those that start by burning books, will end by burning men."
Re:Yawn by infolation · 2018-06-17 02:50 · Score: 4, Insightful

From TFA:

it has implications for a variety of other tasks that deep learning has struggled with, including... problems like prime number factorization
If it could help with finding the prime factorization of large semiprime numbers, i.e. numbers that are the product of exactly two primes, then that would be quite useful.

*cough* cryptography
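
To make the factoring problem concrete, here is a small sketch that splits a semiprime by plain trial division. This is only an illustration of the task, not anything from the paper; the function name `factor_semiprime` is made up, and the code assumes its input really is a product of two primes.

```python
def factor_semiprime(n):
    """Return (p, q) with p <= q and p * q == n, found by trial division.

    Assumes n is a semiprime; for other composites it just returns the
    smallest prime factor and the remaining cofactor.
    """
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p, n // p
        p += 1
    raise ValueError("n has no nontrivial factor, so it is not a semiprime")


print(factor_semiprime(15))     # (3, 5)
print(factor_semiprime(10403))  # (101, 103)
```

The loop runs up to the square root of n, so the work grows far too quickly for the number sizes used in cryptography, which is why a genuinely better method would matter.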
Re:Yawn by AHuxley · 2018-06-17 03:01 · Score: 3, Funny

Assembling any type of IKEA furniture from the box?

--
Domestic spying is now "Benign Information Gathering"
It just created a look-up table. by Fly+Swatter · 2018-06-17 03:02 · Score: 1

Find state of cube and look up the next move, the lookup table could also include every move from that point to solve with very little future work. Granted, creating that table efficiently is where the magic happens.
1. Re:It just created a look-up table. by religionofpeas · 2018-06-17 04:07 · Score: 1
  
  The number of possible cube configurations is too large to make a lookup table practical.
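
A quick back-of-the-envelope check of that claim, using the standard counting argument for the 3 x 3 x 3 cube. The one-byte-per-state table is a hypothetical best case assumed here just to get a size estimate.

```python
from math import factorial

# Corner cubies: 8! arrangements, 3^7 independent orientations.
corners = factorial(8) * 3**7
# Edge cubies: 12! arrangements, 2^11 independent orientations.
edges = factorial(12) * 2**11
# Corner and edge permutations must have matching parity, which halves the total.
states = corners * edges // 2

print(f"{states:,}")  # 43,252,003,274,489,856,000

# Hypothetical table: a single byte per state to store the next move.
table_exabytes = states / 10**18
print(f"about {table_exabytes:.1f} exabytes at one byte per state")
```

Even at one byte per entry the table would run to tens of exabytes, before counting the time needed to fill it.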
Re:starting with the finished cube by DontBeAMoran · 2018-06-17 03:14 · Score: 2

The same way rich people "learn" how to become rich.

--
#DeleteFacebook
Long lost by AndyKron · 2018-06-17 03:18 · Score: 1

Searching for long forgotten Sokoban.exe game now... Found it.
Actually that's a great idea: knapsack problem by goombah99 · 2018-06-17 03:20 · Score: 2

While Ikea furniture is designed with assembly in mind other things are not. Say for example, an airplane. So the assembly process might not be optimal. Letting the computer look for a more optimal process might be useful.
Or more practically, packing items into a shipping box: the famous knapsack problem.
I hate these Slashdot summaries of algorithms. You end up thinking, gosh, that's stupid, when it's not. Just the description is stupid. Like a car analogy.

--
Some drink at the fountain of knowledge. Others just gargle.
1. Re:Actually that's a great idea: knapsack problem by religionofpeas · 2018-06-17 03:57 · Score: 1
  
  While Ikea furniture is designed with assembly in mind other things are not. Say for example, an airplane.
  I would assume that an airplane designer is considering assembly (and maintenance, which is even harder) in every part of the design.
2. Re: Actually that's a great idea: knapsack problem by religionofpeas · 2018-06-17 07:24 · Score: 1
  
  If you can make the plane 5% lighter with twice the assembly time, they go for it.
  In order to make the optimal choice, you have to know in advance how much assembly time is required for each of the various options.
3. Re: Actually that's a great idea: knapsack problem by zwarte+piet · 2018-06-18 01:31 · Score: 1
  
  My guesses are usually in the 1700% ballpark
Brute force method... by cre1mer · 2018-06-17 03:28 · Score: 1

The fastest way to reset a Rubik's Cube is to pull it apart and reassemble it. That works well on the original Rubik's Cube (3 x 3 x 3). Other variations (4 x 4 x 4, 5 x 5 x 5, or 17 x 17 x 17) are increasingly difficult as the parts get smaller.

--
Goodbye, Slashdot!
1. Re:Brute force method... by gweihir · 2018-06-17 05:45 · Score: 1
  
  Overall, when starting from nothing, this _is_ the fastest way. I figured this out as a teenager. But it also is a way these "AI" things cannot come up with, because they do not actually have any general intelligence. They cannot think "outside of the box" at all. Still lots of applications for this, but replacing a smart person is not among them.
  
  --
  Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Games??!! by PPH · 2018-06-17 04:02 · Score: 1

If ever you've traveled, you know that Montezuma's Revenge is no game!

--
Have gnu, will travel.
Re:Backwards search basically by PPH · 2018-06-17 04:11 · Score: 1

This is what I was thinking of as well (putting my 37 year old copy of The Handbook of Artificial Intelligence back on the shelf).

--
Have gnu, will travel.
Re:Bring on the robot overlords by PPH · 2018-06-17 04:30 · Score: 1

Dominance. Control. Accumulation of wealth. Fighting over resources. etc ....
Heuristics optimization. It has worked well for us so far.

Humanity would be best served by AI overlords.
And yet in practically every book or movie involving the 'benign' transfer of social control over to benevolent overlords, the humans end up unhappy. Even if not explicitly stated, such control is generally perceived as 'evil'.

--
Have gnu, will travel.
Something's not adding up by Dynedain · 2018-06-17 04:36 · Score: 2

Either the article writer didn't understand the whitepaper, or the researchers haven't actually done anything novel.

Having been trained, the network then uses a standard search tree to hunt for suggested moves for each configuration.

This works because the beginning state and end state of a Rubik's Cube are effectively identical. It's the same number of tiles, in a specific arrangement. As humans, we've defined the "solved" state to be all the tiles color-matched to a side. But the "solved" state could just as arbitrarily be any pattern or arrangement of colors across the cube.
Reversing the simulation to work backwards from the "solved" to some specific state of scrambled is exactly the same problem as starting from some specific state of scrambled and trying to get to the solved.

--
I'm out of my mind right now, but feel free to leave a message.....
1. Re:Something's not adding up by religionofpeas · 2018-06-17 07:15 · Score: 1
  
  The summary and article are quite confusing. They started with solved cubes, and scrambled them in different amounts to generate training data so they could train a neural net to give the most likely candidate turns that would solve the cube.
  After the network is trained, you can give it a random scrambled cube, and it will do a Monte-Carlo tree search based on the guidance of the neural net.
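
The data-generation step described above can be sketched on a toy puzzle. Everything here is an assumption for illustration: the 4-tuple state, the three moves standing in for face turns, and the name `scrambled_samples`. The paper's actual training targets are not reproduced.

```python
import random

SOLVED = (0, 1, 2, 3)

# Hypothetical toy moves on a 4-tuple, standing in for cube face turns.
MOVES = {
    "rot_left": lambda s: s[1:] + s[:1],
    "rot_right": lambda s: s[-1:] + s[:-1],
    "swap": lambda s: (s[1], s[0]) + s[2:],
}


def scrambled_samples(max_depth, per_depth, seed=0):
    """Scramble outwards from SOLVED and record (state, scramble_depth) pairs."""
    rng = random.Random(seed)
    samples = []
    for depth in range(1, max_depth + 1):
        for _ in range(per_depth):
            state = SOLVED
            for _ in range(depth):
                state = MOVES[rng.choice(sorted(MOVES))](state)
            samples.append((state, depth))
    return samples


samples = scrambled_samples(max_depth=5, per_depth=3)
print(len(samples), samples[0])
```

Note that the scramble depth is only an upper bound on the true distance to the solved state, since random moves can undo each other. That is one reason a learned estimate plus a tree search is still needed afterwards.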
But we already know a shortcut to solve it by corezz · 2018-06-17 07:22 · Score: 1

There are countless videos online that show the same simple technique that can solve any Rubik's Cube. You repeat the same movements over and over -- even blindfolded -- and it's solved. Depending on your hand speed you can solve any combination in under a few minutes. I had my mom, who has never seen a Rubik's Cube, repeat the same steps and she was able to solve it with ease. Then she asked what a Rubik's Cube was for/about. 'nuf said.
Re:gweihir, a question... apk by gweihir · 2018-06-17 07:59 · Score: 1

I am going to assume you really are APK.
I very, very rarely post as AC and I never sign it when I do. I also never do it to rile anybody up. I most definitely did not write that posting.
There are some deficients here that cannot stand that I can understand things they are incapable of understanding and that I dare to tell them they are stupid when they have written (again) something extremely stupid. Cowardly, dishonorable and utterly pathetic trolling.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
Re:Backwards search basically by sfcat · 2018-06-17 11:04 · Score: 1

This is what I was thinking of as well (putting my 37 year old copy of The Handbook of Artificial Intelligence back on the shelf).
I think you mean Norvig Russell...here is a free PDF Artificial Intelligence: A Modern Approach

--
"Those that start by burning books, will end by burning men."
Re:Backwards search basically by PPH · 2018-06-17 11:34 · Score: 1

You couldn't find one older?

--
Have gnu, will travel.
Re:It's me & thanks for honesty... apk by gweihir · 2018-06-17 12:03 · Score: 1

I have had them from time to time, but apparently I am not interesting enough to rate permanent stalkers. Now I will make damned sure to not ever do any AC postings that resemble this crap, you have my word. Not that I ever intended to do anything like this, it is just completely dishonorable and to me, that counts for something. People sniping from the dark are destroyers of communities and have no place in civilized society.
So if it is AC and claims to be from me, it is not.

--
Most ACs are not even worth the keystrokes to insult them. Be generically insulted by this and ignored otherwise.
HS Trig proofs by guy_scree · 2018-06-17 12:06 · Score: 1

Some trig tests consisted of proposed equalities. You had to determine which ones were valid. My technique was to assume it was true, and work backwards. The teacher objected to my starting off assuming it was true. So I wrote the steps from the bottom of the answer box to the top, announcing I had derived the proposition from a known equality. Teacher couldn't say a damn thing. Just goes to show the quality of math teaching in the 1950s.