A Rock Paper Scissors Brainteaser
New submitter arsheive (609065) writes with a link to this interesting RPS brainteaser: "How do you play against an opponent who _must_ throw Rock 50% of the time, and how much would you be willing to pay to play against them?"
Rock, paper, scissors, lizard, Spock!
Scissors cut paper
Paper covers rock
Rock crushes lizard
Lizard poisons Spock
Spock smashes scissors
Scissors decapitate lizard
Lizard eats paper
Paper disproves Spock
Spock vaporizes rock
Rock crushes scissors
Political debates have me rolling my eyes so much I think I got optical whiplash. I should sue. - Foamy The Squirrel
Oooh, not rock again!!!!
actually, as each choice beats one and either ties or is beaten by the other two, the odds of winning any random round of RPS is 33%.
But the whole point is that it is not random, since your opponent's choices are constrained. He is forced to play rock 50% of the time. If he plays randomly the other 50%, then you can always play paper and win 2/3rds of the time, lose 1/6 of the time, and tie 1/6th. But then he can adapt, and play rock 50% and scissors the other 50%, resulting in a tie (1/3 win, 1/3 loss, 1/3 tie). But you can adapt to that by playing rock 50% and paper 50%. You will win 50%, lose 25%, and tie 25%.
I've been having students in my introductory programming courses work on this class of problem for a few years.. They all seem to really enjoy it. I code up bots to play RPS with certain biases just like the OP and they have to program a single player that identifies the bias in an opponent and adjusts its play to give it an advantage. They all routinely can generate solutions that perform far better than random against predictable, dumb bots, but things get very interesting when I throw the students' bots against each other in a throwdown tournament. :)
From TFA: "At the start of each round an independent judge flips a fair coin and tells your opponent the result but does not tell you. If the coin came up heads your opponent must play rock."
The opponent isn't forced to get at least 50% rock after any number of plays.
With no such restriction, random choices on both sides lead to 33% win, 33% draw, 33% loss, right? With the opponent throwing Rock 50% of the time, assuming the other 50% is evenly divided between Paper and Scissors, if I always throw paper I'll win 50% of the time, lose 25% of the time, and draw 25% of the time.
So depending how the betting works, I'd be pretty willing.
No, you are describing another game where the opponent is forced to play rock 50% of the time, paper 25% of the time and scissors 25% of the time.
This game is different.
Actually now that I think about it more
... you realize that you're doing the submitter's homework?
Required reading for internet skeptics
You should play paper 4/6 of the time, rock 1/6, and scissors 1/6 of the time.
The key (if you RFTA) is that whether or not your opponent plays rock is determined by a coin toss. So really you are playing a compound game. You are playing a coin toss and rock paper scissors (RPS). Since the coin toss determines your opponents move, you can think of it as playing 50% coin toss and 50% RPS. The RPS is a subgame of the coin toss.
Since the coin toss is the dominate game, you play with win that first. But instead of heads/tails, it is paper/other. The answer to the coin toss is a 50/50 guess of heads/tails, so the answer to the paper/other is 50% paper, 50% other.
The "other" is the RPS game. And since the answer to the RPS game is 1/3 rock, 1/3 paper, 1/3 scissors, we know what the solution to the other 50% of the game is.
So the equations are:choice = (Coin Toss) + (RPS) so: paper = 1/2 + 1/3, rock = 0 + 1/3, scissors = 0 + 1/3. Or paper = 4/6, rock = 1/6, scissors = 1/6.
First, make sure you read TFA, since it explains what the summary doesn't: how the 50% is determined and how the opponent can play in the non-forced turns.
If you play using a deterministic algorithm, for example always play paper, the opponent can figure it out and beat you on all the non-forced turns. At best you'll get an even result.
If you play using a random algorithm, the opponent can figure out the frequencies you're using and compensate for that. For example, if you decide to play paper 50% of the time and rock and scissors 25% of the time, you'd win against an opponent playing rock 50% of the time and paper and scissors 25% of the time. However, if the opponent decides to play rock 50% of the time and scissors the other 50%, the result is even again. If the opponent would be forced to play rock more than 50% of the time, there is no room to compensate and you would win consistently with 100% paper. I think that with 50% rock, there is enough room to respond to any frequency distribution you can come up with, although I have no proof for that.
You could change your algorithms during play, but if there isn't any algorithm that results in an advantage when playing it consistently, gaining an advantage from changing your algorithm would depend on how well your opponent responds to your changes. In other words, you're playing mind games. I don't think the 50% rock restriction is going to be of any help here.
...but...but... but it's interesting homework! :-)
Assorted stuff I do sometimes: Lemuria.org
The Nash Equilibrium is for you to play paper 2/3rds of the time, and rock 1/3rd. His best counter strategy is to play rock 50% (he cannot go lower) and scissors 50%. He cannot do better. If you deviate from 2/3 paper and 1/3 rock, he can adjust his strategy to do better. With the optimal strategy, you will win 1/2, lose 1/3, and tie 1/6.
Here is my search for the Nash Equilibrium:
#include
struct rps {
double rock;
double paper;
double scissors;
};
static double
eval(struct rps *a, struct rps *b)
{
return
(a->rock * (b->scissors - b->paper)) +
(a->paper * (b->rock - b->scissors)) +
(a->scissors * (b->paper - b->rock));
}
int
main(void)
{
struct rps you;
struct rps him;
him.rock = 0.5;
double worst_best_eval_for_him = 1.0;
double best_rock_for_you = 0;
double best_paper_for_you = 0;
double worst_best_paper_for_him = 0;
double dx = 0.001;
for (you.rock = 0; you.rock best_eval_for_him) {
best_eval_for_him = p;
best_paper_for_him = him.paper;
}
}
if (worst_best_eval_for_him > best_eval_for_him) {
worst_best_eval_for_him = best_eval_for_him;
best_rock_for_you = you.rock;
best_paper_for_you = you.paper;
worst_best_paper_for_him = best_paper_for_him;
}
}
}
printf("worst_best_eval_for_him = %f\n", worst_best_eval_for_him);
printf("best_rock_for_you = %f\n", best_rock_for_you);
printf("best_paper_for_you = %f\n", best_paper_for_you);
printf("worst_best_paper_for_him = %f\n", worst_best_paper_for_him);
return 0;
}
Sorry, but Slashdot mangled that code badly because of the angle brackets.
Let me try again:
#include <stdio.h>
struct rps {
double rock;
double paper;
double scissors;
};
static double
eval(struct rps *a, struct rps *b)
{
return
(a->rock * (b->scissors - b->paper)) +
(a->paper * (b->rock - b->scissors)) +
(a->scissors * (b->paper - b->rock));
}
int
main(void)
{
struct rps you;
struct rps him;
him.rock = 0.5;
double worst_best_eval_for_him = 1.0;
double best_rock_for_you = 0;
double best_paper_for_you = 0;
double worst_best_paper_for_him = 0;
double dx = 0.001;
for (you.rock = 0; you.rock < 1.0; you.rock += dx) {
for (you.paper= 0; (you.paper + you.rock) < 1.0; you.paper+= dx) {
you.scissors = 1.0 - you.rock - you.paper;
double best_paper_for_him = 0.0;
double best_eval_for_him = -1.0;
for (him.paper = 0; him.paper < 0.5; him.paper += dx) {
him.scissors = 1.0 - him.rock - him.paper;
double p = eval(&him, &you);
if (p > best_eval_for_him) {
best_eval_for_him = p;
best_paper_for_him = him.paper;
}
}
if (worst_best_eval_for_him > best_eval_for_him) {
worst_best_eval_for_him = best_eval_for_him;
best_rock_for_you = you.rock;
best_paper_for_you = you.paper;
worst_best_paper_for_him = best_paper_for_him;
}
}
}
printf("worst_best_eval_for_him = %f\n", worst_best_eval_for_him);
printf("best_rock_for_you = %f\n", best_rock_for_you);
printf("best_paper_for_you = %f\n", best_paper_for_you);
printf("worst_best_paper_for_him = %f\n", worst_best_paper_for_him);
return 0;
}
there is a factual problem with the summary...
It is a bad summary, but only because the wording is ambiguous, not that it's factually incorrect. The statement you're objecting to is perfectly correct in one interpretation, and dead wrong in another. Your own counter-statement, "it is not required of the opponent to play rock 50% of the time," is equally ambiguous. In fact, 50% of the time (assuming a fair coin), the opponent is required to play rock, so it's true that "it is required of the opponent to play rock, 50% of the time". Leaving out the comma yields a true sentence (assuming the correct interpretation is chosen of the now even more ambiguous sentence) that contradicts the quoted sentence of yours, assuming you parse your sentence as "it is not required: that the opponent play rock 50% of the time", but does not contradict it at all if your sentence is parsed "it is not required that the opponent play rock, 50% of the time", since 50% of the time, the opponent can choose freely, and thus is 50% of the time, is not required to make any particular choice. So, both the summary and your explanation of what's wrong with it contain statements that not factually incorrect, just ambiguously worded such that a reader might interpret it to mean something that is incorrect rather than something that is correct.
This is a textbook example of why programming computers in plain English would be a monstrously bad idea.
"Convictions are more dangerous enemies of truth than lies."
Of course. You use the rock to smash him in the head while he tries to stab you with the scissors. Your friend then uses the paper to write a letter to your parents about how you died in a stupid fight about statistics.
Get free satoshi (Bitcoin) and Dogecoins
A hem! I'm a frayed knot.
You kick him in the testicles and let him keep the chicken.
As some have said, the optimum is a mixed strategy of playing paper 2/3 of the time and rock the rest of the time.
Let's say we're player 1 and they're player 2. The way to calculate this exactly is to first observe that the general form for each players mixed strategy is:
P(r1) + P(p1) + P(s1) = 1
1/2 + P(p2) + P(s2) = 1
where, for example, P(r1) is the probability of a rock being played by player 1.
This means that for a given P(r1), P(p1) and P(p2),
P(s1) = 1 - P(r1) - P(p1) [1]
P(s2) = 1/2 - P(p2) [2]
Now observe that the probability of player 1 winning, P(W), is the sum of the probabilities of each of the winning configurations occurring,
P(w) = P((r1 and s2) or (p1 and r2) or (s1 and p2))
= P(r1 and s2) + P(p1 and r2) + P(s1 and p2)
= P(r1)*P(s2) + P(p1)*P(r2) + P(s1)*P(p2)
After substituting from [1] and [2] we get the following,
P(W) = P(r1)*(1/2 - P(p2)) + P(p1)*(1/2) + (1 - P(r1) - P(p1))*P(p2)
Let's make this easier to read:
x = P(r1)
y = P(p1)
z = P(p2)
So then we have:
P(W) = x*(1/2 - z) + y/2 + (1 - x - y)*z
Similarly, we find that the probability of losing, P(L), is:
P(L) = x*P(p2) + y*(1/2 - z)+ (1 - x - y)/2
Now if we gain by 1 for winning, gain by -1 for losing (aka make a loss by 1) and nothing happens when we draw, then the first players expected gains, F, is given by:
F(x, y, z) = (1)*P(W) + (0)*P(D) + (-1)*P(L) = x + y/2 - 1/2 + z*(1 - 3*x)
[Note that we didn't bother finding the probability of drawing because it has no effect on the value of the expected gains]
Now, recall that player 1 controls x and y (the probability of player 1 playing rock and paper respectively) and player 2 controls z (the probability of player 2 playing paper). Be aware that x and y can at least be zero and at most be 1 minus the other. Also note that z can only range from 0 to 1/2 (because the other half of the time player 2 must throw a rock). Finally, you should understand that player 1 wants to maximize F and player 2 wants to minimize F.
From this equation it is clear that when x is less than or equal to a 1/3, player 2 should minimize z in order to minimize F. So in this case,
F(x, y, z) = F(x, y, 0) = x + y/2 - 1/2
and player 1 would maximize this by setting x = 1/3 (as was assumed for this case) and y = 0 which gives F(1/3, 0, 0) = 1/3 - 1/2 = 1/6.
If on the other hand x were set to a probability greater than or equal to 1/3, player 2 would want to maximize z. This would give,
F(x, y, z) = F(x, y, 1/2) = x + y/2 - 1/2 + (1/2)*(1 - 3*x) = (y - x)/2
Player 1 would maximize this by minimizing x (x = 1/3) and maximizing y (y = 2/3 as x + y = 1),
F(x, y, z) = F(1/3, 2/3, 1/2) = ((2/3) - (1/3))/2 = 1/6.
Finally it's nice to observe that when player 1 sets x = 0, the second player's choice of z has no effect on the gains function whatsoever.
So, in conclusion, the optimal strategy for player 1 is x = 1/3 and y = 2/3 where x is the probability of player 1 playing rock and y is the probability of player 1 playing paper.