A Rock Paper Scissors Brainteaser
New submitter arsheive (609065) writes with a link to this interesting RPS brainteaser: "How do you play against an opponent who _must_ throw Rock 50% of the time, and how much would you be willing to pay to play against them?"
Come again? It is specified that the player forced to play 50% rock (as mandated by a fair coin flip you don't get to see) plays intelligently and will adapt to your play. When he figures out you're always playing rock, he'll always play paper when he doesn't have to play rock. You tie 50% of the time and lose 50% of the time. That's lousy strategy.
You're guaranteed to break even by always playing paper. When the opponent adapts, he'll always play scissors when he doesn't have to play rock, and you win 50% of the time and lose 50% of the time. The question is, can you do better than that?
Expect them to play scissors a lot to beat your paper. Play rock as often as they play scissors.
After that you need to run a probability set on all the possible combinations, with the unknowns for his paper and scissors, only knowing that they total to 50%.
We were all warned a long time ago that MS products sucked, remember the Magic 8 Ball said, "Outlook not so good"
Rock, paper, scissors, lizard, Spock!
Scissors cut paper
Paper covers rock
Rock crushes lizard
Lizard poisons Spock
Spock smashes scissors
Scissors decapitate lizard
Lizard eats paper
Paper disproves Spock
Spock vaporizes rock
Rock crushes scissors
Political debates have me rolling my eyes so much I think I got optical whiplash. I should sue. - Foamy The Squirrel
I think the winning strategy is to randomly throw 50% paper to cover his rock. I'm just guessing though. No idea how much to pay.
For all intensive purposes, "whom" is no longer a word. That begs the question, "who cares"?
The opponent is forced to pick 50% rock, but he has no limitations other than that. If you went 100% paper, you'd not beat me playing 50% rock by more than a tiny sliver of a percent it takes me to realize you are going 50% paper.
And the problem has nothing to do with the someone not just picking rock, but doing so in a very predictable manner. Otherwise, we'd not be talking about someone picking rock 50% of the time, but playing against someone that plays randomly, but tells you what he is picking half the time, never lying. That's a very, very boring brainteaser.
The first naive approach that doesn't assume idiocy on the other side's part is to assume he'll try to guess our play. If we play in a non deterministic fashion, his best options is probably random, but instead of .33/.33/.33, we get .50/.25/.25. Against an opponent doing that, you could go .25/.50/.25, and win quite a bit.
If he is forced to get at least 50% rock after a certain number of plays, then we can change our probabilities depending on how far from the standard distribution our opponent is: For instance, if we were playing to 10 throws, and he never picked rock in the first 5, we'd get him in the last 5, because his moves are forced. To avoid this, someone that has to reach that 50% will probably also change their probabilities after every throw, just to avoid such a 'free' run. One might even consider starting with an over 50% rock probability, because under that set of rules, having overplayed rock lets us play more optimally later, while underplaying rock leads to major losses.
Either way, a more detailed problem specification is required.
How would you measure 50%? When does the game end? If my opponent strictly goes by 50 25 25 then Yes, I'd be happy yo play for an infinite amount of money, however given the play I am going to be playing, my opponent may opt/adapt for 50 0 50 instead and we are stalemate again .. So the game is rigged, conditions flawed, how would you measure those 50%?
Oooh, not rock again!!!!
actually, as each choice beats one and either ties or is beaten by the other two, the odds of winning any random round of RPS is 33%.
But the whole point is that it is not random, since your opponent's choices are constrained. He is forced to play rock 50% of the time. If he plays randomly the other 50%, then you can always play paper and win 2/3rds of the time, lose 1/6 of the time, and tie 1/6th. But then he can adapt, and play rock 50% and scissors the other 50%, resulting in a tie (1/3 win, 1/3 loss, 1/3 tie). But you can adapt to that by playing rock 50% and paper 50%. You will win 50%, lose 25%, and tie 25%.
I've been having students in my introductory programming courses work on this class of problem for a few years.. They all seem to really enjoy it. I code up bots to play RPS with certain biases just like the OP and they have to program a single player that identifies the bias in an opponent and adjusts its play to give it an advantage. They all routinely can generate solutions that perform far better than random against predictable, dumb bots, but things get very interesting when I throw the students' bots against each other in a throwdown tournament. :)
From TFA: "At the start of each round an independent judge flips a fair coin and tells your opponent the result but does not tell you. If the coin came up heads your opponent must play rock."
The opponent isn't forced to get at least 50% rock after any number of plays.
Though of the rounds where there is a winner the odds are 50/50.
With no such restriction, random choices on both sides lead to 33% win, 33% draw, 33% loss, right? With the opponent throwing Rock 50% of the time, assuming the other 50% is evenly divided between Paper and Scissors, if I always throw paper I'll win 50% of the time, lose 25% of the time, and draw 25% of the time.
So depending how the betting works, I'd be pretty willing.
No, you are describing another game where the opponent is forced to play rock 50% of the time, paper 25% of the time and scissors 25% of the time.
This game is different.
100% paper strategy will win 50% of the time. Of the remaining 50% of games played, (assuming even distribution of the remaining picks) 25% will be losses and 25% will be tied
No, you are describing another game where the opponent is forced to play rock 50% of the time, paper 25% of the time and scissors 25% of the time.
This game is different.
(I'm repeating my self from another post but many people are making the same mistake)
If i know the number of rounds in advance (atleast your oponent has to right if he needs to calculate the 50%). I would recalculate the odds after every round taking into account the number of rock move remaining for him...
Like drawing to an inside flush, an "optimized" strategy is not necessarily what the opponent plays. There is no reason inherent in the description to make assumptions about the opponent's other play. They may also be constrained to play paper the other fifty percent of the time, and to play paper , then rock, then paper, then rock. In the real world, don't assume that the minimal description of the problem gives all the important data.
But then he adapts by always playing paper when he doesn't have to play rock. Now you win 25%, tie 50%, lose 25%, and you're back to even up.
there is a factual problem with the summary.TFA says it all so better read it. If not read this (I hope not to mess up it too much).
It is not required of the opponent to play rock 50% of the time. The referee is using a fair coin to determine if your opponent is to play rock or not. If s/he is not forced to play rock s/he is free to chose allowing him also to chose rock 100% of time too if s/he so wishes.
"Never go in against a Sicilian when death is on the line!"
Actually now that I think about it more
... you realize that you're doing the submitter's homework?
Required reading for internet skeptics
Lets call the guy with the restriction player 1 and the other player 2.
If you think about it player 1 got 3 "pure" strategies (as in: each other strategy he can play can be seen as a mixture of these 3):
(1) rock 100%,
(2) paper 50%/rock 50% and
(3) scissors 50%/rock 50%.
Against (1) rock gives 0, paper -1 and scissors 1.
Against (2) rock gives 1/2, paper -1/2 and scissors 0.
Against (3) rock gives -1/2, paper 0 and scissors 1/2.
In each case, the number is the probability of player 1 winning minus player 2 winning.
We see that player 2 should not play scissors, because he will never gain anything from it and clearly the optimal strategy should be able to gain something (we will see that it is 1/6). Then, knowing that, paper 50%/rock 50% is better than rock 100% for player 1: If player 2 plays rock, player 1 gets 1/2 (instead of 0) and if player 2 plays scissors player 1 gains -1/2 (instead of -1).
Hence, we are down to:
(2) paper 50%/rock 50% and (3) scissors 50%/ rock 50% vs. rock and paper. If player 1 plays (2) with pr. 1/3 and (3) with probability 2/3, he loses 1/6 against both rock and paper. If player 2 plays rock with probability 1/3 and paper with probability 2/3 he gets 1/6 against (1) and 1/6 against (2). This is optimal since each player have a way to guarantee that player 1 loses 1/6 and player 2 wins 1/6. If either had a better strategy it would break the other players guarantee (note that the given strategy for player 1 wins 1/6 against scissors, again showing that it is a bad strategy for player 2 and player 2's strategy wins 1/3 against rock 100% showing that it is a bad strategy).
Over the long term, the strategy must converge to stable, therefore true random can be the only optimized strategy.
50% of time opponent must play R. The remaining 50% of the time they can equally choose R,P,S.
R -> P = 1/2 + 1/3 * 1/2
S -> R = 1/3 * 1/2
P -> S = 1/3 * 1/2
For a guaranteed win, roll a die: 1-4 => P, 5 => R, 6 => S.
By how much? Consider you are random as above and opponent is fixed wlog at 100% R. You win 2/3 of the time and lose 1/6 of the time.
The expected payoff to play is 2/3, the expected cost is 1/6. Payoff - cost = 1/2, so the most you should be willing to pay to play a $100 game is $150.
You should play paper 4/6 of the time, rock 1/6, and scissors 1/6 of the time.
The key (if you RFTA) is that whether or not your opponent plays rock is determined by a coin toss. So really you are playing a compound game. You are playing a coin toss and rock paper scissors (RPS). Since the coin toss determines your opponents move, you can think of it as playing 50% coin toss and 50% RPS. The RPS is a subgame of the coin toss.
Since the coin toss is the dominate game, you play with win that first. But instead of heads/tails, it is paper/other. The answer to the coin toss is a 50/50 guess of heads/tails, so the answer to the paper/other is 50% paper, 50% other.
The "other" is the RPS game. And since the answer to the RPS game is 1/3 rock, 1/3 paper, 1/3 scissors, we know what the solution to the other 50% of the game is.
So the equations are:choice = (Coin Toss) + (RPS) so: paper = 1/2 + 1/3, rock = 0 + 1/3, scissors = 0 + 1/3. Or paper = 4/6, rock = 1/6, scissors = 1/6.
First, make sure you read TFA, since it explains what the summary doesn't: how the 50% is determined and how the opponent can play in the non-forced turns.
If you play using a deterministic algorithm, for example always play paper, the opponent can figure it out and beat you on all the non-forced turns. At best you'll get an even result.
If you play using a random algorithm, the opponent can figure out the frequencies you're using and compensate for that. For example, if you decide to play paper 50% of the time and rock and scissors 25% of the time, you'd win against an opponent playing rock 50% of the time and paper and scissors 25% of the time. However, if the opponent decides to play rock 50% of the time and scissors the other 50%, the result is even again. If the opponent would be forced to play rock more than 50% of the time, there is no room to compensate and you would win consistently with 100% paper. I think that with 50% rock, there is enough room to respond to any frequency distribution you can come up with, although I have no proof for that.
You could change your algorithms during play, but if there isn't any algorithm that results in an advantage when playing it consistently, gaining an advantage from changing your algorithm would depend on how well your opponent responds to your changes. In other words, you're playing mind games. I don't think the 50% rock restriction is going to be of any help here.
Is it harder than spelling seems?
at his head
...but...but... but it's interesting homework! :-)
Assorted stuff I do sometimes: Lemuria.org
You seriously think that each player in RPS has a 33% chance of winning each round? Think a little bit about that. Oh, I forgot, this is /.
What do you think the odds are for each player? Keep in mind that there are three possible outcomes for each round: win, lose, or draw.
"Convictions are more dangerous enemies of truth than lies."
The Nash Equilibrium is for you to play paper 2/3rds of the time, and rock 1/3rd. His best counter strategy is to play rock 50% (he cannot go lower) and scissors 50%. He cannot do better. If you deviate from 2/3 paper and 1/3 rock, he can adjust his strategy to do better. With the optimal strategy, you will win 1/2, lose 1/3, and tie 1/6.
Here is my search for the Nash Equilibrium:
#include
struct rps {
double rock;
double paper;
double scissors;
};
static double
eval(struct rps *a, struct rps *b)
{
return
(a->rock * (b->scissors - b->paper)) +
(a->paper * (b->rock - b->scissors)) +
(a->scissors * (b->paper - b->rock));
}
int
main(void)
{
struct rps you;
struct rps him;
him.rock = 0.5;
double worst_best_eval_for_him = 1.0;
double best_rock_for_you = 0;
double best_paper_for_you = 0;
double worst_best_paper_for_him = 0;
double dx = 0.001;
for (you.rock = 0; you.rock best_eval_for_him) {
best_eval_for_him = p;
best_paper_for_him = him.paper;
}
}
if (worst_best_eval_for_him > best_eval_for_him) {
worst_best_eval_for_him = best_eval_for_him;
best_rock_for_you = you.rock;
best_paper_for_you = you.paper;
worst_best_paper_for_him = best_paper_for_him;
}
}
}
printf("worst_best_eval_for_him = %f\n", worst_best_eval_for_him);
printf("best_rock_for_you = %f\n", best_rock_for_you);
printf("best_paper_for_you = %f\n", best_paper_for_you);
printf("worst_best_paper_for_him = %f\n", worst_best_paper_for_him);
return 0;
}
Sorry, but Slashdot mangled that code badly because of the angle brackets.
Sorry, but Slashdot mangled that code badly because of the angle brackets.
Let me try again:
#include <stdio.h>
struct rps {
double rock;
double paper;
double scissors;
};
static double
eval(struct rps *a, struct rps *b)
{
return
(a->rock * (b->scissors - b->paper)) +
(a->paper * (b->rock - b->scissors)) +
(a->scissors * (b->paper - b->rock));
}
int
main(void)
{
struct rps you;
struct rps him;
him.rock = 0.5;
double worst_best_eval_for_him = 1.0;
double best_rock_for_you = 0;
double best_paper_for_you = 0;
double worst_best_paper_for_him = 0;
double dx = 0.001;
for (you.rock = 0; you.rock < 1.0; you.rock += dx) {
for (you.paper= 0; (you.paper + you.rock) < 1.0; you.paper+= dx) {
you.scissors = 1.0 - you.rock - you.paper;
double best_paper_for_him = 0.0;
double best_eval_for_him = -1.0;
for (him.paper = 0; him.paper < 0.5; him.paper += dx) {
him.scissors = 1.0 - him.rock - him.paper;
double p = eval(&him, &you);
if (p > best_eval_for_him) {
best_eval_for_him = p;
best_paper_for_him = him.paper;
}
}
if (worst_best_eval_for_him > best_eval_for_him) {
worst_best_eval_for_him = best_eval_for_him;
best_rock_for_you = you.rock;
best_paper_for_you = you.paper;
worst_best_paper_for_him = best_paper_for_him;
}
}
}
printf("worst_best_eval_for_him = %f\n", worst_best_eval_for_him);
printf("best_rock_for_you = %f\n", best_rock_for_you);
printf("best_paper_for_you = %f\n", best_paper_for_you);
printf("worst_best_paper_for_him = %f\n", worst_best_paper_for_him);
return 0;
}
There are a fair number good ideas here that are getting close to the correct solution here, but there is also a lot of stuff that is pretty far off so I'll be posting a video solution to my youtube channel for this Monday. If your interested in the solution you can subscribe to my youtube channel and you'll see it when I post it on Monday :)
https://www.youtube.com/channe...
I think you have an error in there. His best strategy in your example would be to always use paper, in which case he's doing the exact same thing as you are, and so the odds are even.
I think your best strategy would have to involve reacting to his. Start with 100% paper until you see what he's throwing when he's not throwing rock. This gives you at least even odds regardless of what he does.
He'll probably start throwing scissors, at which point you switch to you switch to 50% rock 50% paper. This will have you win 2/4, lose 1/4, draw 1/4.
He should respond by starting to throw paper. so you just go back to 100% paper and win half the time. If he switches back to scissors, you start throwing rock again.
The goal is to use long stretches of paper to for him into 50R:50S (which only breaks even at best), and then switch to 50R:50P to pick up some wins until he switches back 50R:50P (again, he'll only be able to break even). So when he's properly countering you, he can only break even, and every time you switch strategies you should pick up some extra wins. If he tries to go for a midpoint (50:25:25) he'll lose to either of your two strategies.
Since I already explained the optimal solution to the basic question mentioned in the summery lets solve the bonus question too (my solution also matches the solution given in comments on the article side so it should be good (and said to be correct by the author) - note currently no answer with a high score is correct - mine has 1).
The bonus question is that you play two rounds, and your oppoent must play atleast rock once. So, if he plays something not rock in the first round he must play rock in the second and loss (you just play paper). If he plays rock in the first he can play 1/3 all in the second (which leads to a draw like normal). Thus, if he plays rock first it is like normal RPS (because he get 0 in the next). Otherwise you get one free win (for the second round).
Thus, we can model the first game of the bonus question as (where the numbers is the number of rounds he wins on avg given the choice in round 1):
R P S
R 0 -1 1
P 0 -1 -2
S -2 0 -1
Where you pick columns and him rows. We see that rock dominates paper for the row player. We get
R P S
R 0 -1 1
S -2 0 -1
For the column player, the choice of rock now dominates scissors. We get
R P
R 0 -1
S -2 0
Playing rock 1/3 and paper 2/3 for the collumn player gives -2/3 wins on avg. Similarly, the row player can get -2/3 wins on avg by playing rock 2/3 and scissors 1/3.
To figure out what you should do, first assume your opponent is rational, and will make good choices whenever he is able. Since he knows that you will play a paper-heavy strategy to counter his rock-heavy strategy, it would not be rational to voluntarily choose more rocks. That could only make things worse.
But if he tried to exploit your paper-heavy strategy by throwing scissors on turns when he gets a choice, you'd have a perfect strategy against this: All rock. On forced rock, you get a redo, and on non-forced rock, he does scissors and you win. So on first pass, I think that your opponent should favor paper when he can choose. It's not like you'll ever do scissors. That's auto-lose half the time - basically a complete surrender of your advantage. So paper is a safe move for your opponent.
The problem is that if he did 50% rock and 50% paper, then all-paper will be your perfect counter. He won't let you do that, so he'll have to throw in some scissors. Just how many? It looks now like you will both simultaneously have to determine optimal strategies in order to answer that question, and this will require derivatives of two sets of optimal-choice equations - so that you can solve for the two maxima. Sounds like a fun problem!
Of course. You use the rock to smash him in the head while he tries to stab you with the scissors. Your friend then uses the paper to write a letter to your parents about how you died in a stupid fight about statistics.
Get free satoshi (Bitcoin) and Dogecoins
... is coming monday on the author's youtube channel https://www.youtube.com/channe... -- he commented with this but is evidently bad at slashdot so I'm hopping this comment gets read even if his doesn't ...
@AlexSheive
It says that rock *must* be thrown 50% of the time. Not that there is a 50/50 chance of the choice being rock for each game. So the number of games *has* to be set out before gaming commences. As you draw closer to the last game, the odds will change to how likely rock will be thrown. This is very similar to counting cards in a blackjack.
"When life gives you lemons, don't make lemonade. Make life take the lemons back!" -- Cave Johnson
If you play 2/3 paper 1/3 rock. He will play 1/2 paper, 1/2 rock.
Wins for you:
Paper vs rock: 2/3 * 1/2 = 1/3 win
Rock vs scissors: 1/3 * 0 = 0 win
Scissors vs paper: 0 * 1/2 = 0 win
For him:
Paper vs rock: 1/2 * 1/3 = 1/6
Rock vs scissors: 1/2 * 0 = 0
Scissors vs paper: 0 * 2/3 = 0
Your optimal strategy (2/3 paper, 1/3 rock) vs his optimal strategy (1/2 paper, 1/2 rock), results 1/3 win not a 1/2 win.
His nash strategy (minimax strategy) is to play rock 1/2, scissors 1/3 and paper 1/6 (it is true that rock 1/2 and scissors 1/2 is a best response to the nash strategy, but that does not mean it is optimal against abitrary strategies). I have a long earlier post showing the strategies and so on :)
Yes, you are right - paper will be 25% tie, 25% lose, ending up with 50% tie, 25% win, 25% lose, so still purely random.
I don't think that strategy we are looking for involves "reactions" - this can be always defeated by opponent which overguess you by one step. I would hope to find a strategy which would lead you to win > 50% even if opponent knows it (strategy itself, not the result of random choice at given stage).
A hem! I'm a frayed knot.
Ok, my current guess would be around 1/3 rock and 2/3 paper. Opponent cannot replicate this strategy, because of requirement of having at least 50% rock. If he goes 50% scissors, he will have 1/6 tie, 2/6 lose, 1/6 lose and 2/6 win, so 3:2 in my favor. With 50% paper, he gets 1/6 tie, 2/6 lose, 1/6 win, 2/6 tie, so 2:1 in my favor. 100% rock is obvious lose. I think that mixing scissors and paper will be clearly worse than pure scissors on his part (because with scissors, he has at least 2/3 chances of winning after he got a choice).
From random testing, it seems that optimal ratio of rock is around 38% (not 1/3). Around that point, pure paper and pure scissor strategies for opponent seems to get equal and I'm winning around 62% of cases.
There is probably some interesting math behind that...
doesn't include a consideration of my lifespan. If I only live 100 years my opponent can play scissors every single game and play rock continuously after I am dead and the restriction would still be satisfied. In other words, "having to play rock 50 percent of the time" doesn't give you any relevant information about your opponent's behaviour.
I'm getting better test results from having around 38.4% rock, rest paper. At this point his scissor and paper strategies become equal and I'm winning (assuming ties are repeated) 61.7% of cases instead of 60% of cases with 1/3 rock rest paper.
I think that trick is finding a place where any mixture of scissors or paper for him is equally bad. At this moment, his choices doesn't matter anymore (of course, unless he choses rock even at free choice). At 1/3 rock 2/3 paper, scissors are clearly better choice for him, so it is not a perfect place.
Nobody seems to be asking what the definition of "50% of the time" is ?
Over what interval/time frame?
If the opponent throws scissors or paper on his first move is he then required to throw rock on his second move?
Probably because they read the link which describes how "50% of the time" is determined.
Opponent throws rock more than he wants to => You shouldn't throw scissors at all (assumption)
Let your optimal throw be according to probability { R: 1-x, P: x, S: 0 }
Opponent doesn't want to throw R at all in this case; he should respond by matching optimal probabilities on what he can choose: { R: 1/2, P: (1-x)/2, S: x/2 }
You continue to assume that you shouldn't throw any scissors; you respond by matching optimal probabilities on R and P: { R: x/(x+1), P: 1/(x+1), S: 0 }
Since your throw probabilities are optimal 1-x = x/(x+1) and x = 1/(x+1)
Guess what? x = 1/(x+1) is the definition of the golden ratio conjugate! (x = 0.618034) And it works in the first equation as well .5, P: 0.190983, S: 0.309017 }
Therefore your optimal throw is according to probabilities { R: 0.381966, P: 0.618034, S: 0 }
Therefore your opponent throws according to probabilities { R:
You will win (2x-x^2)/2 = 0.427051 of the time, lose (1-2x+2x^2)/2 = 0.263932 of the time, and draw (1-x^2)/2 = 0.309017 of the time. When draws are dropped you win 0.618034 (x) of the time.
I believe this is the optimal solution (but I still have to confirm the above assumption)
Ok, Nash equilibrium, finally the correct approach. But here's an interesting variation, seemingly very similar, but a bit harder: same probabilities, but introduce a third side with a predefined algorithm and infinite budget. The game still looks the same to you as the player: 50% rock, 50% human RPS player, but in fact suddenly the game depends on so many details...
The setting: you play RPS against a human opponent, but do not communicate with them directly - the interface is operated by the third side and only shows the choices (no chatting, etc). Your opponent is free to choose as he wishes. The operator intervenes with 50% chance (fair coin toss) by replacing your opponent's choice with rock (replacing rock with rock still counts as intervention). Of course the fact of intervention is not communicated to players.
Scoring rules: On "normal" rounds you and your opponent score against each other as in normal RPS. On "intervention" rounds you both score against the operator's budget. Your score depends on your choice vs. operator's rock. Your opponent's score depends on his original choice vs. your choice (so, yes, in this game you can both win/lose simultanously).
1. What is your best strategy in this game?
2. More interestingly: is the best strategy different if your opponent does not know about the existence of interventions (they change nothing visible on his side after all)? If you could tell him about it before the start of the game, would you?
That's my main question, but for the really bored, some more options. Consider a symmetric variant, where your choices as seen by your opponent also get replaced with rock randomly. Does the strategy depend on the operator's algorithm: option 1 - single throw decides no replacement vs. replacement in both directions, option 2 - separate throws for each direction, option 3 - always replace, single throw determines direction?
And what if your choices get replaced with something different than your opponents'? He 50% rock vs. you 50% paper, or he 50% rock vs. you 50% scissors?
And the most difficult one: review all these variants under "no ties" rule - in case of tie you replay the round, but the coin is not thrown again - if there was an intervention it will happen again, if there wasn't, it won't. This is a difficult variant - the same game might be a tie-resolving round for one player and a start of a normal round for the other, so you're never really sure whether your next round will get a throw. You will no longer see 50% rocks, the actual proportion depends on strategies of both players.
Man, coming up with such variations is fun. And details sometimes matter. Adding a side channel for communication between players also might change everything, as you can then cooperate against the operator... And from the player's point of view all these variants technically still fit within the bounds of the same simplified description: random choice of 50% rock, 50% human RPS player...
You kick him in the testicles and let him keep the chicken.
Get. A. Life.
Oh, dear, trolls must be starving now. Slashdot is not what it was any more.
As some have said, the optimum is a mixed strategy of playing paper 2/3 of the time and rock the rest of the time.
Let's say we're player 1 and they're player 2. The way to calculate this exactly is to first observe that the general form for each players mixed strategy is:
P(r1) + P(p1) + P(s1) = 1
1/2 + P(p2) + P(s2) = 1
where, for example, P(r1) is the probability of a rock being played by player 1.
This means that for a given P(r1), P(p1) and P(p2),
P(s1) = 1 - P(r1) - P(p1) [1]
P(s2) = 1/2 - P(p2) [2]
Now observe that the probability of player 1 winning, P(W), is the sum of the probabilities of each of the winning configurations occurring,
P(w) = P((r1 and s2) or (p1 and r2) or (s1 and p2))
= P(r1 and s2) + P(p1 and r2) + P(s1 and p2)
= P(r1)*P(s2) + P(p1)*P(r2) + P(s1)*P(p2)
After substituting from [1] and [2] we get the following,
P(W) = P(r1)*(1/2 - P(p2)) + P(p1)*(1/2) + (1 - P(r1) - P(p1))*P(p2)
Let's make this easier to read:
x = P(r1)
y = P(p1)
z = P(p2)
So then we have:
P(W) = x*(1/2 - z) + y/2 + (1 - x - y)*z
Similarly, we find that the probability of losing, P(L), is:
P(L) = x*P(p2) + y*(1/2 - z)+ (1 - x - y)/2
Now if we gain by 1 for winning, gain by -1 for losing (aka make a loss by 1) and nothing happens when we draw, then the first players expected gains, F, is given by:
F(x, y, z) = (1)*P(W) + (0)*P(D) + (-1)*P(L) = x + y/2 - 1/2 + z*(1 - 3*x)
[Note that we didn't bother finding the probability of drawing because it has no effect on the value of the expected gains]
Now, recall that player 1 controls x and y (the probability of player 1 playing rock and paper respectively) and player 2 controls z (the probability of player 2 playing paper). Be aware that x and y can at least be zero and at most be 1 minus the other. Also note that z can only range from 0 to 1/2 (because the other half of the time player 2 must throw a rock). Finally, you should understand that player 1 wants to maximize F and player 2 wants to minimize F.
From this equation it is clear that when x is less than or equal to a 1/3, player 2 should minimize z in order to minimize F. So in this case,
F(x, y, z) = F(x, y, 0) = x + y/2 - 1/2
and player 1 would maximize this by setting x = 1/3 (as was assumed for this case) and y = 0 which gives F(1/3, 0, 0) = 1/3 - 1/2 = 1/6.
If on the other hand x were set to a probability greater than or equal to 1/3, player 2 would want to maximize z. This would give,
F(x, y, z) = F(x, y, 1/2) = x + y/2 - 1/2 + (1/2)*(1 - 3*x) = (y - x)/2
Player 1 would maximize this by minimizing x (x = 1/3) and maximizing y (y = 2/3 as x + y = 1),
F(x, y, z) = F(1/3, 2/3, 1/2) = ((2/3) - (1/3))/2 = 1/6.
Finally it's nice to observe that when player 1 sets x = 0, the second player's choice of z has no effect on the gains function whatsoever.
So, in conclusion, the optimal strategy for player 1 is x = 1/3 and y = 2/3 where x is the probability of player 1 playing rock and y is the probability of player 1 playing paper.
How would the opponent overguess you? You can delay your strategy switch for as little or as much time as you like, since their best strategy only breaks even.
If they don't know what you're doing, you could rack up more wins by exploiting flaws in their strategy, and rack up even more wins when they "wise up" and try to counter you. I guess the difference between our answers is that I'm treating this as an actual game against another person, whereas you're looking for more a game theoretical optimum.
Now, if your opponent does have perfect knowledge of your strategy, then randomly alternating between the strategies would be the best approach, which is effectively the same as your 1/3 rock, 2/3 paper idea. By the way, I ran some sims in matlab just now, and I'm getting the optimum to be bang on 1/3-2/3, so the math might not be so interesting after all :-(
There are only two degrees of freedom: the probability that we pick paper (assume we pick rock the rest of the time -- picking scissors should never make sense); and the probability that our opponent picks scissors when they have the choice. I'm assuming that our opponent never intentionally chooses rock. That seems fair since our plan involves throwing paper more often than not.
Making a meshgrid of those two probabilities in 1% increments, running 100k games at each point, and then taking the minimum along the axis of our opponent's choice (i.e. assuming they always pick the best strategy), I find a peak at 68% with us having a 16% advantage, which is about 1-in-6. So it looks like your initial guess in the other post was correct.
In any given round of RPS, one particular throw will win that round and two will not. That looks like 1 out of 3 to me, but of course I'm no mathematician!
You're being pedantic. He's not picking an number between 1 and 10^10^10, he's picking a number between 1 and 3.
Rock Paper Scissors can be redefined as:
Both players secretly choose a number, {0, 1, 2} (I have chosen to 0 index for modular logic). Let A be the value Alice picks, and B be the value Bob picks.
The winner, W, is evaluated by performing:
W = (A - B) % 3
Alice wins if W = 1
Bob wins if W = 2
Tie if W = 0
It is easy to show that, when the probability of each possible choice in A is evenly distributed, then W is also independent and evenly distributed, regardless of B; and likewise if choice B is evenly distributed then A is irrelevant to W's probability distribution.
I can't believe I have to make this point on slashdot, that general rock paper scissors between two rational people has a 1/3 chance of winning, 1/3 chance of losing, and 1/3 chance of tying, per iteration.
So if anybody plays randomly, the odds are absolutely 33% (rounded to the nearest percent). The interesting part of this brain teaser is that both players know that one player is compelled toward non-even distribution.
In this game, you should play paper 67% of the time, and rock 33% of the time. Instead of rehashing to formal proof that is evidently coming on monday, I'm just going to break it down intuitively. Evidence:
Scissors is clearly not something you should ever play, because it's going to lose at least half the time, so a scissors play cannot break even on average, so we can eliminate it from consideration.
That makes things easy: we just have to determine the correct ratio of rock to paper. Clearly all rock is a losing strategy because it's trivially defeated by a 50% rock 50% paper strategy. Clearly all paper nets you 0 because it's matched by a 50% rock 50% scissors strategy.
Note how shifting from a rock to a paper strategy causes the adversary to switch from a paper to a scissors strategy. But an opponent who uses scissors is one who is more vulnerable to rock! But switching to rock increases the opponent's use of paper, which you cannot counter because scissors are unacceptable.
Consider a 90% paper, 10% rock strategy. Clearly 50% rock and 50% scissors fails here. Assuming $100 / iteration, average results:
When we use paper, we win 50% of the time, and lose 50% of the time. Edge = $0.
When we use rock, we win 50% of the time, and lose 0% of the time. Edge = $50
Overall Edge = Edge(paper) * 90% + Edge(rock) * 10% = $5
We've gained an edge by eliminating 10% of all failures. But maybe the adversary should actually be using 50% rock, 45% scissors, and 5% paper, in order to keep the scissors/paper ratio in sync with our paper/rock strategy. Then:
When we use paper, we win 50% of the time, lose 45% of the time, tie 5% of the time. Edge = $5
When we use rock, we win 45% of the time, lose 5% of the time, and tie 50% of the time. Edge = $4
Overall Edge = Edge(paper) * 90% + Edge(rock) * 10% = $4.90
Well, the adversary did a bit better, but they're still losing.
Well, maybe there's an even better strategy for the adversary?
Short answer, no, there is not. Yeah, I didn't prove it, but that's a simpler problem because the strategy you are optimizing against is completely defined. Slide the scales around a bit, and you find a global optimum at 2/3 paper 1/3 rock, which the opponent counters with 1/2 rock, 1/3 scissors, 1/6 paper.
You are right! :-)
Currently hooked on AMP
I like this post, because its being snarky at someone for not reading things fully while at the same time failing to notice that someone else has already posted the exact same response to GP as this one.
Since he can expect you to tthrow paper pretty often, he'd want to make scissors take up at least some of the non-rock ones he can throw, so if you throw rock, he would lose and if you were countering one of his rocks, you'd negate.
I don't know the meaning of the word 'don't' - J
Thanks for that. If Slashdot used an RTF SYSTEM LIKE THE REST OF THE CIVILISED WORLD, your angle brackets would have been no problem....
Shoes for Industry. Shoes for the Dead.
As teacher of C, nice code. Lack of comments, though, -20%.
Ideone. Codepad. Any number of pastebins.
Based on @ShanghaiBill's solution, I wrote a solver and simulation in Python (that also fixes ShanghaiBill's buggy pinning of "him.rock" to 0.5 - the player could in theory, choose to play rock at more than 50% probability). Use Pypy for speedy execution - I uploaded the code to GitHub: https://github.com/ttsiodras/R...
Let r, p and s be the probabilities we play rock, paper and scissors on a given game respectively. It follows that,
r + p + s = 1
where r, p and s are elements of the interval [0,1]
For our opponent we will use a similar notation, only with uppercase letters,
R + P + S = 1
where R is an element of [1/2, 1] and p and s are elements of [0, 1/2]
Rearranging, we find that,
s = 1 - r - p
S = 1 - R - P
Now the expected gains, G, we get from any one game is given by,
G = AW + 0D + (-A)L = A(W - L)
where A is the amount exchanged between players and W, D and L is the probability of winning, drawing and losing respectively.
The probability of winning on single game, W, is given by the sum of the probabilities of each of the winning configurations occurring is
W = rS + pR + sP
Similarly the probability of losing on a game is
L = Rs + Pr + Sp
and so the expected gains is given by
G = A( r - R + P - p + 3(pR - Pr) )
Now note that our opponent must play rock at least half the time, so whenever we play scissors, at minimum we will lose half of the time and so it will never improve our gains. Therefore we should never play scissors. In the worst case scenario, our opponent is aware that we will never play scissors and that playing rock more than he has to will never improve his gains. This means
R = 1/2
G = A( r + (p-1)/2 + P(1-3r) )
Assuming the opponent is aware of this, he will do what he can to minimize G. This would mean that if (1-3r) is negative then he would maximize P (the only variable he controls). Similarly if (1-3r) is positive, he will minimize P. Finally if (1-3r) is 0, he cannot affect G.
In all of these scenarios, if we set r = 1/3 (or at least as close as possible to it) we will maximize G. This means that we should play rock a third of the time and paper the rest of the time. Our expected gains per game is G = A/6 so if A = $100 we should only be willing to pay $16.66 or less if we wish to make any gains per game.
Bullshit..
Any answer that eliminates one of the three options for any of the players, have not been thought through.
If you eliminate one option, the opponent will have optimal strategy guaranteeing no loses.
Any optimal playing strategy will need a percentage of all threes.
Btw, the question is a teaser. There is not optimal solution as there is no equlibrium, any chosen strategy will have an answer by the opponent that makes it sub optimal.
Then you fall over dead, because he spent the last few years building up an immunity to iocane powder.
Use the Kobayashi Maru solution.
Star Trek transporters are just 3d printers.
He could throw whatever he likes in the short term, and claim he was going to make up the 50% rock in the long term. How would you disprove this?
Or you could just knock him out and take his wallet, amirite? Stupid eggheads with their book-learnin'.
Yo dawg, I heard you like the Ackermann function, so OH GOD OH GOD OH GOD
Solution with some explanation of game theory concepts, by the article author: https://www.youtube.com/watch?...
then on the 1st round rock is 50:50; but rock is thrown then on the 2n round they can't throw rock again (that would be 100% rock when they _MUST_ throw rock 50% of the time); or if they didn't throw rock on the 1st round then they'd have to throw rock on the 2nd round, etc. ;-)