In game theory, a player's strategy is any of the options that can be chosen in a setting where the pay-off depends not only on the player's action but on the action of every player.
Consider two prisoners held in separate cells, interrogated simultaneously, and offered deals (lighter jail sentences) for betraying their fellow criminal. The prisoner's strategy can either be to "cooperate" or "defect".
Prisoner 1, Prisoner 2 | Co-operate (with other) | Defect (betray other) _____________________________________________________________________________ Co-operate (with other) | 1 , 1 | 4 , 0 Defect (betray other) | 0 , 4 | 2 , 2
A strategy profile is a set of strategies for all players which fully specifies all actions in a game. A strategy profile must include one and only one strategy for every player. In the case of the prisoners, there are four possible strategy profile as shown in the table cells above.
A pay-off is usually associated with every strategy profile. In the case of the prisoners, if they both co-operate, they both get a pay-off (prison sentence) of 1 year.
A strategy profile is a Nash equilibrium if no player can do better by unilaterally changing only his or her strategy. Thus, each strategy in a Nash equilibrium is a best response to all other strategies in that equilibrium.
The Nash equilibrium depends on if the objective is to minimize or maximize the pay-off matrix:
If the players intend to minimize their prison sentence, then the only Nash equilibrium occurs when both prisoner choose to defect. If any of the prisoner unilaterally attempts to deviate from this strategy profile, he gets a higher prison sentence of 4 years.
If the players intend to maximize their prison sentence, then the only Nash equilibrium occurs when both prisoner choose to co-operate. If any of the prisoner unilaterally attempts to deviate from this strategy profile, he gets to walk free.
For a given two-player game with each player having n strategies, the pay-off matrix P is given as a n x n cell array, which elements specifies a strategy profile and the corresponding pay-offs. In the case of the prisoners
P = {[1, 1] [4, 0]; [0, 4] [2, 2]}
Given the pay-off matrix P and an objective (min or max), return the 2D index of all the Nash equilibria (i.e. the strategy profile) in a k x 2 matrix and the corresponding pay-off in a column cell array.
could you please break down this problem's testsuite (using "%%" double-comments) to make it a bit easier for players to spot issues in their code? thanks (and very nice problems btw!)
363 Solvers
196 Solvers
Find best placement for ordered dominoes (harder)
194 Solvers
Getting the row and column location from a matrix
192 Solvers
328 Solvers