Applied Graphics - What is a Game


  1. What is a Game? • There are many types of games: board games, card games, video games, field games (e.g. football), etc. • In this course, our focus is on games where: – There are 2 or more players. – There is some choice of action where strategy matters. – The game has one or more outcomes, e.g. someone wins, someone loses. – The outcome depends on the strategies chosen by all players; there is strategic interaction. • What does this rule out? – Games of pure chance, e.g. lotteries, slot machines (strategies don't matter). – Games without strategic interaction between players, e.g. Solitaire.
  2. Why Do Economists Study Games? • Games are a convenient way in which to model the strategic interactions among economic agents. • Many economic issues involve strategic interaction. – Behavior in imperfectly competitive markets, e.g. Coca-Cola versus Pepsi. – Behavior in auctions, e.g. Investment banks bidding on U.S. Treasury bills. – Behavior in economic negotiations, e.g. trade. • Game theory is not limited to Economics.
  3. Five Elements of a Game: 1. The players – how many players are there? – does nature/chance play a role? 2. A complete description of what the players can do – the set of all possible actions. 3. The information that players have available when choosing their actions 4. A description of the payoff consequences for each player for every possible combination of actions chosen by all players playing the game. 5. A description of all players’ preferences over payoffs.
  4. The Prisoners' Dilemma Game • Two players, prisoners 1, 2. • Each prisoner has two possible actions. – Prisoner 1: Don't Confess, Confess – Prisoner 2: Don't Confess, Confess • Players choose actions simultaneously without knowing the action chosen by the other. • Payoff consequences quantified in prison years. • Fewer years=greater satisfaction=>higher payoff. • Prisoner 1 payoff first, followed by prisoner 2 payoff.
  5. Prisoners’ Dilemma in “Normal” or “Strategic” Form (Prisoner 1 payoff listed first, Prisoner 2 payoff second):

                                      Prisoner 2
                               Don't Confess   Confess
     Prisoner 1 Don't Confess       1,1         15,0
                Confess             0,15         5,5
  6. How to play games using the comlabgames software. • Start the browser software (IE or Netscape). • Enter the URL address provided on the board. • Enter a user name and organization=pitt. Then click the start game button. • Start playing when roles are assigned. • You are randomly matched with one other player. • Choose a row or column depending on your role.
  7. Computer Screen View
  8. Results Screen View • Shows the number of times each outcome has been realized, the number of times the row player has played each strategy, the number of times the column player has played each strategy, and the total number of rounds played.
  9. Prisoners' Dilemma in “Extensive” Form • Prisoner 1 moves first, choosing Don't Confess or Confess; Prisoner 2 then chooses Don't Confess or Confess at either of his two decision nodes. A line connecting Prisoner 2's decision nodes represents a constraint on the information that Prisoner 2 has available: while 2 moves second, he does not know what 1 has chosen. The four terminal payoffs are 1,1; 15,0; 0,15; and 5,5. Payoffs are: Prisoner 1 payoff, Prisoner 2 payoff.
  10. Computer Screen View
  11. Prisoners' Dilemma is an example of a Non-Zero Sum Game • A zero-sum game is one in which the players' interests are in direct conflict, e.g. in football, one team wins and the other loses; payoffs sum to zero. • A game is non-zero-sum if players' interests are not always in direct conflict, so that there are opportunities for both to gain. • For example, when both players choose Don't Confess in the Prisoners' Dilemma.
  12. The Prisoners' Dilemma is applicable to many other situations. • Nuclear arms races. • Dispute Resolution and the decision to hire a lawyer. • Corruption/political contributions between contractors and politicians. • Can you think of other applications?
  13. Simultaneous versus Sequential Move Games • Games where players choose actions simultaneously are simultaneous move games. – Examples: Prisoners' Dilemma, Sealed-Bid Auctions. – Must anticipate what your opponent will do right now, recognizing that your opponent is doing the same. • Games where players choose actions in a particular sequence are sequential move games. – Examples: Chess, Bargaining/Negotiations. – Must look ahead in order to know what action to choose now. • Many strategic situations involve both sequential and simultaneous moves.
  14. The Investment Game is a Sequential Move Game • The Sender moves first: Don't Send (payoffs 4,0) or Send. If the sender sends (invests) 4, the amount at stake is tripled (=12), and the Receiver then chooses between Keep (payoffs 0,12) and Return (payoffs 6,6). Payoffs are: Sender payoff, Receiver payoff.
  15. Computer Screen View • You are either the sender or the receiver. If you are the receiver, wait for the sender's decision.
  16. One-Shot versus Repeated Games • One-shot: play of the game occurs once. – Players are likely not to know much about one another. – Example: tipping on your vacation. • Repeated: play of the game is repeated with the same players. – Indefinitely versus finitely repeated games. – Reputational concerns matter; opportunities for cooperative behavior may arise. • Advice: If you plan to pursue an aggressive strategy, ask yourself whether you are in a one-shot or in a repeated game. If a repeated game, think again.
  17. Strategies • A strategy must be a “comprehensive plan of action”, a decision rule or set of instructions about which actions a player should take following all possible histories of play. • It is the equivalent of a memo, left behind when you go on vacation, that specifies the actions you want taken in every situation which could conceivably arise during your absence. • Strategies will depend on whether the game is one-shot or repeated. • Examples of one-shot strategies – Prisoners' Dilemma: Don't Confess, Confess – Investment Game: • Sender: Don't Send, Send • Receiver: Keep, Return • How do strategies change when the game is repeated?
  18. Repeated Game Strategies • In repeated games, the sequential nature of the relationship allows for the adoption of strategies that are contingent on the actions chosen in previous plays of the game. • Most contingent strategies are of the type known as "trigger" strategies. • Example trigger strategies – In prisoners' dilemma: Initially play Don't confess. If your opponent plays Confess, then play Confess in the next round. If your opponent plays Don't confess, then play Don't confess in the next round. This is known as the "tit for tat" strategy. – In the investment game, if you are the sender: Initially play Send. Play Send as long as the receiver plays Return. If the receiver plays Keep, never play Send again. This is known as the "grim trigger" strategy.
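The trigger strategies above can be sketched in code. Below is a minimal simulation of "tit for tat" in the repeated Prisoners' Dilemma, assuming the prison-year payoffs from the matrix on slide 5 (fewer years is better); the function names and match length are our own.

```python
# Sketch: repeated Prisoners' Dilemma with the "tit for tat" strategy.
# Payoffs are prison years from slide 5's matrix; fewer years is better.

YEARS = {  # (action1, action2) -> (years1, years2); D = Don't Confess, C = Confess
    ("D", "D"): (1, 1),
    ("D", "C"): (15, 0),
    ("C", "D"): (0, 15),
    ("C", "C"): (5, 5),
}

def tit_for_tat(opponent_history):
    """Play Don't Confess first, then copy the opponent's last action."""
    return "D" if not opponent_history else opponent_history[-1]

def always_confess(opponent_history):
    return "C"

def play(strat1, strat2, rounds=10):
    """Run the repeated game; return total prison years for each player."""
    h1, h2 = [], []          # action histories of player 1 and player 2
    total1 = total2 = 0
    for _ in range(rounds):
        a1, a2 = strat1(h2), strat2(h1)  # each sees the opponent's history
        y1, y2 = YEARS[(a1, a2)]
        total1, total2 = total1 + y1, total2 + y2
        h1.append(a1); h2.append(a2)
    return total1, total2

print(play(tit_for_tat, tit_for_tat))      # mutual cooperation: (10, 10)
print(play(tit_for_tat, always_confess))   # exploited once, then retaliates: (60, 45)
```

Two tit-for-tat players cooperate every round; against an unconditional confessor, tit for tat is exploited in round 1 and then punishes for the rest of the match.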
  19. Information • Players have perfect information if they know exactly what has happened every time a decision needs to be made, e.g. in Chess. • Otherwise, the game is one of imperfect information – Example: In the repeated investment game, the sender and receiver might be differentially informed about the investment outcome. For example, the receiver may know that the amount invested is always tripled, but the sender may not be aware of this fact.
  20. Assumptions Game Theorists Make • Payoffs are known and fixed. People treat expected payoffs the same as certain payoffs (they are risk neutral). – Example: a risk neutral person is indifferent between $25 for certain or a 25% chance of earning $100 and a 75% chance of earning $0. – We can relax this assumption to capture risk averse behavior. • All players behave rationally. – They understand and seek to maximize their own payoffs. – They are flawless in calculating which actions will maximize their payoffs. • The rules of the game are common knowledge: – Each player knows the set of players, strategies and payoffs from all possible combinations of strategies: call this information “X.” – Common knowledge means that each player knows that all players know X, that all players know that all players know X, that all players know that all players know that all players know X, and so on, ad infinitum.
  21. Equilibrium • The interaction of all (rational) players' strategies results in an outcome that we call "equilibrium." • In equilibrium, each player is playing the strategy that is a "best response" to the strategies of the other players. No one has an incentive to change his strategy given the strategy choices of the others. • Equilibrium is not: – The best possible outcome. Equilibrium in the one-shot prisoners' dilemma is for both players to confess. – A situation where players always choose the same action. Sometimes equilibrium will involve changing action choices (known as a mixed strategy equilibrium).
  22. Sequential Move Games with Perfect Information • Models of strategic situations where there is a strict order of play. • Perfect information implies that players know everything that has happened prior to making a decision. • Sequential move games are most easily represented in extensive form, that is, using a game tree. • The investment game we played in class was an example.
  23. Constructing a sequential move game • Who are the players? • What are the action choices/strategies available to each player? • When does each player get to move? • How much do they stand to gain/lose? • Example 1: The merger game. Suppose an industry has six large firms (think airlines). Denote the largest firm as firm 1 and the smallest firm as firm 6. Suppose firm 1 proposes a merger with firm 6 and, in response, firm 2 considers whether to merge with firm 5.
  24. The Merger Game Tree • Since Firm 1 moves first, it is placed at the root node of the game tree, choosing Buy Firm 6 or Don't Buy Firm 6. Firm 2 then chooses Buy Firm 5 or Don't Buy Firm 5, giving four terminal nodes with payoffs (1A, 2A), (1B, 2B), (1C, 2C) and (1D, 2D). • What payoff values do you assign to firm 1’s payoffs 1A, 1B, 1C, 1D? To firm 2’s payoffs 2A, 2B, 2C, 2D? Think about the relative profitability of the two firms in the four possible outcomes, or terminal nodes of the tree. Use your economic intuition to rank the outcomes for each firm.
  25. Assigning Payoffs • In the same tree (Firm 1 chooses Buy/Don't Buy Firm 6, then Firm 2 chooses Buy/Don't Buy Firm 5, with terminal payoffs (1A, 2A), (1B, 2B), (1C, 2C), (1D, 2D)): • Firm 1’s Ranking: 1B > 1A > 1D > 1C. Use 4, 3, 2, 1. • Firm 2’s Ranking: 2C > 2A > 2D > 2B. Use 4, 3, 2, 1.
  26. The Completed Game Tree • Firm 1 chooses Buy or Don't Buy Firm 6; Firm 2 then chooses Buy or Don't Buy Firm 5. The four terminal payoffs (Firm 1, Firm 2) are, left to right: (3, 3), (4, 1), (1, 4), (2, 2). • What is the equilibrium? Why?
  27. Example 2: The Senate Race Game • Incumbent Senator Gray will run for reelection. The challenger is Congresswoman Green. • Senator Gray moves first, and must decide whether or not to run advertisements early on. • The challenger Green moves second and must decide whether or not to enter the race. • Issues to think about in modeling the game: – Players are Gray and Green. Gray moves first. – Strategies for Gray are Ads, No Ads; for Green: In or Out. – Ads are costly, so Gray would prefer not to run ads. – Green will find it easier to win if Gray does not run ads.
  28. Computer Screen View
  29. What are the strategies? • A pure strategy for a player is a complete plan of action that specifies the choice to be made at each decision node. • Gray has two pure strategies: Ads or No Ads. • Green has four pure strategies: 1. If Gray chooses Ads, choose In, and if Gray chooses No Ads, choose In. 2. If Gray chooses Ads, choose Out, and if Gray chooses No Ads, choose In. 3. If Gray chooses Ads, choose In, and if Gray chooses No Ads, choose Out. 4. If Gray chooses Ads, choose Out, and if Gray chooses No Ads, choose Out. • Summary: Gray’s pure strategies: Ads, No Ads. • Green’s pure strategies: (In, In), (Out, In), (In, Out), (Out, Out).
  30. Using Rollback or Backward Induction to find the Equilibrium of a Game • Suppose there are two players A and B. A moves first and B moves second. • Start at each of the terminal nodes of the game tree. What action will the last player to move, player B choose starting from the immediate prior decision node of the tree? • Compare the payoffs player B receives at the terminal nodes, and assume player B always chooses the action giving him the maximal payoff. • Place an arrow on these branches of the tree. Branches without arrows are “pruned” away. • Now treat the next-to-last decision node of the tree as the terminal node. Given player B’s choices, what action will player A choose? Again assume that player A always chooses the action giving her the maximal payoff. Place an arrow on these branches of the tree. • Continue rolling back in this same manner until you reach the root node of the tree. The path indicated by your arrows is the equilibrium path.
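The rollback procedure above can be sketched as a short recursive function. The tree encoding and function names below are our own; the example tree uses the merger-game payoffs from slide 26.

```python
# Sketch: rollback (backward induction) for two-player sequential games
# with perfect information. Tree encoding is our own:
#   terminal node  = tuple of payoffs, e.g. (3, 3)
#   decision node  = (player_index, {action_label: subtree})

def rollback(node):
    """Return (payoffs, equilibrium path) from `node` under rational play."""
    if isinstance(node, tuple) and all(isinstance(x, (int, float)) for x in node):
        return node, []  # terminal node: nothing left to choose
    player, actions = node
    best = None
    for action, subtree in actions.items():
        payoffs, path = rollback(subtree)
        # the mover keeps the action that maximizes her own payoff
        if best is None or payoffs[player] > best[0][player]:
            best = (payoffs, [action] + path)
    return best

# The merger game from slide 26: Firm 1 (player 0) moves first, then Firm 2 (player 1).
merger = (0, {
    "Buy 6":       (1, {"Buy 5": (3, 3), "Don't Buy 5": (4, 1)}),
    "Don't Buy 6": (1, {"Buy 5": (1, 4), "Don't Buy 5": (2, 2)}),
})
print(rollback(merger))  # ((3, 3), ['Buy 6', 'Buy 5'])
```

Pruning the branches Firm 2 would never take, Firm 1 compares 3 (Buy) against 1 (Don't Buy), so the equilibrium path is Buy 6 followed by Buy 5, with payoffs (3, 3), matching slide 26.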
  31. Illustration of Backward Induction in Senate Race Game: Green’s Best Response
  32. Illustration of Backward Induction in Senate Race Game: Gray’s Best Response This is the equilibrium
  33. Is There a First Mover Advantage? • Suppose the sequence of play in the Senate Race Game is changed so that Green gets to move first. The payoffs for the four possible outcomes are exactly the same as before, except now, Green’s payoff is listed first.
  34. Whether there is a first mover advantage depends on the game. • To see if the order matters, rearrange the sequence of moves as in the senate race game. • Other examples in which order may matter: – Adoption of new technology. Better to be first or last? – Class presentation of a project. Better to be first or last? • Sometimes order does not matter. For example, is there a first mover advantage in the merger game as we have modeled it? Why or why not? • Is there such a thing as a second mover advantage? – Sometimes, for example: • Sequential bidding by two contractors. • Cake-cutting: One person cuts, the other gets to decide how the two pieces are allocated.
  35. Adding more players • Game becomes more complex. • Backward induction, rollback can still be used to determine the equilibrium. • Example: The merger game. There are 6 firms. – If firms 1 and 2 make offers to merge with firms 5 and 6, what should firm 3 do? – Make an offer to merge with firm 4? – Depends on the payoffs.
  36. 3 Player Merger Game • Firm 1 moves first (Buy or Don't Buy), then Firm 2 (Buy or Don't Buy), then Firm 3 (Buy or Don't Buy), giving eight terminal nodes. The payoffs (Firm 1, Firm 2, Firm 3), left to right, are: (2,2,2), (4,4,1), (4,1,4), (5,1,1), (1,4,4), (1,5,1), (1,1,5), (3,3,3).
  37. Solving the 3 Player Game • Apply rollback to the tree above: first find Firm 3's best choice at each of its four decision nodes, then Firm 2's best choices given Firm 3's, then Firm 1's choice at the root.
  38. Adding More Moves • Again, the game becomes more complex. • Consider, as an illustration, the Game of Nim. • Two players move sequentially. • Initially there are two piles of matches with a certain number of matches in each pile. • Players take turns removing any number of matches from a single pile. • The winner is the player who removes the last match from either pile. • Suppose, for simplicity, that there are 2 matches in the first pile and 1 match in the second pile. We will summarize the initial state of the piles as (2,1), and call the game Nim(2,1). • What does the game look like in extensive form?
  39. Nim (2,1) in Extensive Form
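The Nim(2,1) tree can also be solved by brute-force rollback. Below is a sketch with our own state encoding (a sorted pair of pile sizes); the player who takes the last match wins, as in the rules above.

```python
# Sketch: solving Nim by exhaustive backward induction. A state is a
# pair of pile sizes; the mover removes k >= 1 matches from one pile,
# and the player who removes the last match wins.

from functools import lru_cache

@lru_cache(maxsize=None)
def mover_wins(piles):
    """True if the player about to move can force a win from `piles`."""
    if sum(piles) == 0:
        return False  # the previous player took the last match and won
    for i in (0, 1):
        for take in range(1, piles[i] + 1):
            nxt = list(piles)
            nxt[i] -= take
            if not mover_wins(tuple(sorted(nxt))):
                return True  # some move leaves the opponent in a losing state
    return False

print(mover_wins((2, 1)))  # True: the first mover can force a win in Nim(2,1)
```

The winning first move is to take one match from the pile of two, leaving (1,1): whatever the opponent takes, the first mover takes the last match.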
  40. How reasonable is rollback/backward induction as a behavioral principle? • May work to explain actual outcomes in simple games, with few players and moves. • More difficult to use in complex sequential move games such as Chess. – We can’t draw out the game tree because there are too many possible moves, estimated to be on the order of 10^120. – Need a rule for assigning payoffs to non-terminal nodes – an intermediate valuation function. • May not always predict behavior if players are unduly concerned with “fair” behavior by other players and do not act so as to maximize their own payoff, e.g. they choose to punish “unfair” behavior.
  41. “Nature” as a Player • Sometimes we allow for a special type of player – nature – to make random decisions. Why? • Often there are uncertainties that are inherent to the game, that do not arise from the behavior of other players. – e.g. whether you can find a parking place or not. • A simple example: a 1-player game against Nature. With probability 1/2 Nature chooses G and with probability 1/2 Nature chooses B; following either of Nature's moves, the Player chooses l or r, and the four terminal payoffs are 4, 3, 2, 1.
  42. Playing Against Nature • Nature does not receive a payoff; only the real player(s) do. • In the tree above (Nature chooses G or B with probability 1/2 each, then the Player chooses l or r, with payoffs 4, 3, 2, 1), what is your strategy for playing this game if you are the player?
  43. The Centipede Game • Experimental tests of backward induction have been conducted using this game. • How would you play this game?
  44. Simultaneous Move Games • Arise when players have to make their strategy choices simultaneously, without knowing the strategies that have been chosen by the other player(s). – Student studies for a test; the teacher writes questions. – Two firms independently decide whether or not to develop and market a new product. • While there is no information about what other players will actually choose, we assume that the strategic choices available to each player are known by all players. • Players must think not only about their own best strategic choice but also the best strategic choice of the other player(s).
  45. Normal or Strategic Form • A simultaneous move game is depicted in “Normal” or “Strategic” form using a game table that relates the strategic choices of the players to their payoffs. • The convention is that the row player’s payoff is listed first and the column player’s payoff is listed second. Column Player Strategy C1 Strategy C2 Row Strategy R1 a,b c,d Player Strategy R2 e,f g,h • For example, if Row player chooses R2 and Column player chooses C1, the Row player’s payoff is e and the Column player’s payoff is f.
  46. Strategy Types: Pure versus Mixed • A player pursues a pure strategy if she always chooses the same strategic action out of all the strategic action choices available to her in every round – e.g. Always refuse to clean the apartment you share with your roommate. • A player pursues a mixed strategy if she randomizes in some manner among the strategic action choices available to her in every round. – e.g. Sometimes pitch a curveball, sometimes a slider (“mix it up,” “keep them guessing”). • We focus for now on pure strategies only.
  47. Example: Battle of the Networks • Suppose there are just two television networks. • Both are battling for shares of viewers (0-100%). Higher shares are preferred (= higher advertising revenues). • Network 1 has an advantage in sitcoms. If it runs a sitcom, it always gets a higher share than if it runs a game show. • Network 2 has an advantage in game shows. If it runs a game show, it always gets a higher share than if it runs a sitcom.

                                   Network 2
                            Sitcom        Game Show
     Network 1 Sitcom      55%, 45%       52%, 48%
               Game Show   50%, 50%       45%, 55%
  48. Computer Screen View • Note that we have dropped % from the payoff numbers.
  49. Nash Equilibrium • We cannot use rollback in a simultaneous move game, so how do we find a solution? • We determine the “best response” of each player to a particular choice of strategy by the other player. We do this for both players. • If each player’s strategy choice is a best response to the strategy choice of the other player, then we have found a solution or equilibrium to the game. • This solution concept is known as a Nash equilibrium, after John Nash, who first proposed it. • A game may have 0, 1 or more Nash equilibria.
  50. Cell-by-Cell Inspection Method • The cell-by-cell inspection method is the most reliable for finding Nash equilibria. First find Network 1’s best response. • If Network 2 runs a sitcom, Network 1’s best response is to run a sitcom. • If Network 2 runs a game show, Network 1’s best response is to run a sitcom.

                                   Network 2
                            Sitcom        Game Show
     Network 1 Sitcom      55%, 45%       52%, 48%
               Game Show   50%, 50%       45%, 55%
  51. Cell-by-Cell Inspection Method, Continued • Next, we find Network 2’s best response. • If Network 1 runs a sitcom, Network 2’s best response is to run a game show. • If Network 1 runs a game show, Network 2’s best response is to run a game show. • The unique Nash equilibrium is for Network 1 to run a sitcom and Network 2 to run a game show. • The Nash equilibrium is the intersection of the two players’ strategic best responses.

                                   Network 2
                            Sitcom        Game Show
     Network 1 Sitcom      55%, 45%       52%, 48%
               Game Show   50%, 50%       45%, 55%
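The cell-by-cell inspection method can be sketched in code: mark every cell in which both players are playing a best response to the other. The payoff table is the Battle of the Networks game from the slides; the function and variable names are our own.

```python
# Sketch: cell-by-cell inspection for a two-player game in normal form.
# A cell is a Nash equilibrium if each player's strategy in that cell is
# a best response to the other player's strategy.

def nash_equilibria(payoffs):
    """payoffs: {(row_strategy, col_strategy): (row_payoff, col_payoff)}."""
    rows = {r for r, _ in payoffs}
    cols = {c for _, c in payoffs}
    equilibria = []
    for r in rows:
        for c in cols:
            u_r, u_c = payoffs[(r, c)]
            row_best = all(payoffs[(r2, c)][0] <= u_r for r2 in rows)
            col_best = all(payoffs[(r, c2)][1] <= u_c for c2 in cols)
            if row_best and col_best:
                equilibria.append((r, c))
    return equilibria

# Battle of the Networks (Network 1 is the row player).
networks = {
    ("Sitcom", "Sitcom"): (55, 45), ("Sitcom", "Game Show"): (52, 48),
    ("Game Show", "Sitcom"): (50, 50), ("Game Show", "Game Show"): (45, 55),
}
print(nash_equilibria(networks))  # [('Sitcom', 'Game Show')]
```

The unique equilibrium found is (Sitcom, Game Show), matching the slide.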
  52. Dominant Strategies • A player has a dominant strategy if it outperforms (has higher payoff than) all other strategies regardless of the strategies chosen by the opposing player(s). • For example, in the battle of the networks game, Network 1 has a dominant strategy of always choosing to run a sitcom. Network 2 has a dominant strategy of always choosing to run a game show. •Why? • Elimination of non-dominant or “dominated” strategies can help us to find a Nash equilibrium.
  53. Successive Elimination of Dominated Strategies • Another way to find Nash equilibria. • Draw lines through (successively eliminate) each player’s dominated strategy(s). • If successive elimination of dominated strategies results in a unique outcome, that outcome is the Nash equilibrium of the game. • We call such games dominance solvable.

                                   Network 2
                            Sitcom        Game Show
     Network 1 Sitcom      55%, 45%       52%, 48%
               Game Show   50%, 50%       45%, 55%
  54. Adding More Strategies • Suppose we add the choice of a “reality TV” show.

                                    Network 2
                            Sitcom    Game Show   Reality TV
     Network 1 Sitcom       55, 45     52, 48       51, 49
               Game Show    50, 50     45, 55       46, 54
               Reality TV   52, 48     49, 51       48, 52

  • What is the Nash equilibrium in this case? First ask: are there any dominated strategies? If so, eliminate them from consideration.
  55. Eliminating the Dominated Strategies Reduces the Set of Strategies that May Comprise Nash Equilibria. • A game show is a dominated strategy for Network 1. A sitcom is a dominated strategy for Network 2.

                                    Network 2
                            Sitcom    Game Show   Reality TV
     Network 1 Sitcom       55, 45     52, 48       51, 49
               Game Show    50, 50     45, 55       46, 54
               Reality TV   52, 48     49, 51       48, 52
  56. After eliminating dominated strategies, continue the search for dominated strategies among the remaining choices. • Reality TV is now a dominated strategy for Network 1. • Game show is now a dominated strategy for Network 2.

                                    Network 2
                            Sitcom    Game Show   Reality TV
     Network 1 Sitcom       55, 45     52, 48       51, 49
               Game Show    50, 50     45, 55       46, 54
               Reality TV   52, 48     49, 51       48, 52
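Successive elimination can likewise be sketched in code. The version below repeatedly removes strictly dominated strategies for each player until none remain, using the 3-strategy network game above; the function name and loop structure are our own.

```python
# Sketch: successive elimination of strictly dominated strategies in a
# two-player normal-form game. u[(r, c)] = (row payoff, column payoff).

def eliminate_dominated(rows, cols, u):
    """Drop strictly dominated strategies for each player until none remain."""
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        for r in rows[:]:  # r is dominated if some r2 beats it against every column
            if any(all(u[(r2, c)][0] > u[(r, c)][0] for c in cols)
                   for r2 in rows if r2 != r):
                rows.remove(r); changed = True
        for c in cols[:]:  # same test for the column player
            if any(all(u[(r, c2)][1] > u[(r, c)][1] for r in rows)
                   for c2 in cols if c2 != c):
                cols.remove(c); changed = True
    return rows, cols

# The 3-strategy network game (Network 1 is the row player).
nets = {
    ("Sitcom", "Sitcom"): (55, 45), ("Sitcom", "Game Show"): (52, 48),
    ("Sitcom", "Reality TV"): (51, 49),
    ("Game Show", "Sitcom"): (50, 50), ("Game Show", "Game Show"): (45, 55),
    ("Game Show", "Reality TV"): (46, 54),
    ("Reality TV", "Sitcom"): (52, 48), ("Reality TV", "Game Show"): (49, 51),
    ("Reality TV", "Reality TV"): (48, 52),
}
strategies = ["Sitcom", "Game Show", "Reality TV"]
print(eliminate_dominated(strategies, strategies, nets))
```

Elimination leaves Sitcom for Network 1 and Reality TV for Network 2, so the game is dominance solvable with equilibrium payoffs (51, 49).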
  57. Cell-by-Cell Inspection Also Works

                                    Network 2
                            Sitcom    Game Show   Reality TV
     Network 1 Sitcom       55, 45     52, 48       51, 49
               Game Show    50, 50     45, 55       46, 54
               Reality TV   52, 48     49, 51       48, 52
  58. Non-Constant-Sum Games • The Network Game is an example of a constant-sum game: the payoffs to both players always add up to the constant sum of 100%. • We could make that game zero-sum by redefining payoffs relative to a 50%-50% share for each network. • Nash equilibria also exist in non-constant-sum or variable-sum games, where players may have some common interest. • For example, prisoner’s dilemma type games. Payoffs are daily profits per store in thousands of dollars; both firms have a dominant strategy:

                                      Burger King
                              No Value Meals   Value Meals
     McDonald's No Value Meals     3, 3           1, 4
                Value Meals        4, 1           2, 2
  59. Cournot Competition • A game where two firms compete in terms of the quantity sold (market share) of a homogeneous good is referred to as a Cournot game, after the French economist who first studied it. • Let q1 and q2 be the number of units of the good that are brought to market by firm 1 and firm 2. Assume the market price P is determined by market demand: P = a - b(q1+q2) if a > b(q1+q2), P = 0 otherwise. (The inverse demand curve is a line in q1+q2 with intercept a and slope -b.) • Firm 1’s profits are (P-c)q1 and firm 2’s profits are (P-c)q2, where c is the marginal cost of producing each unit of the good. • Assume both firms seek to maximize profits.
  60. Numerical Example • Suppose P = 130-(q1+q2), so a=130, b=1 • The marginal cost per unit, c=$10 for both firms. • Suppose there are just three possible quantities that each firm i=1,2 can choose qi = 30, 40 or 60. • There are 3x3=9 possible profit outcomes for the two firms. • For example, if firm 1 chooses q1=30, and firm 2 chooses q2=60, then P=130-(30+60)=$40. • Firm 1’s profit is then (P-c)q1=($40-$10)30=$900. • Firm 2’s profit is then (P-c)q2=($40-$10)60=$1800.
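A quick way to check all 9 outcomes is to compute them directly from the demand and cost parameters of the example; a sketch (it reproduces the payoff matrix on the next slide):

```python
# Sketch: Cournot profits for the numerical example, with demand
# P = a - b(q1 + q2) (floored at zero), a = 130, b = 1, c = 10.

a, b, c = 130, 1, 10
quantities = [30, 40, 60]

def profits(q1, q2):
    """Market price from demand, then profit (P - c) * q for each firm."""
    P = max(a - b * (q1 + q2), 0)
    return (P - c) * q1, (P - c) * q2

for q1 in quantities:
    print(q1, [profits(q1, q2) for q2 in quantities])
```

For instance, `profits(30, 60)` gives P = 130 - 90 = 40, so (40-10)×30 = 900 for firm 1 and (40-10)×60 = 1800 for firm 2, exactly as computed on the slide.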
  61. Cournot Game Payoff Matrix • Depicts all 9 possible profit outcomes for each firm (firm 1 payoff listed first):

                              Firm 2
                    q2=30        q2=40        q2=60
     Firm 1 q1=30   1800, 1800   1500, 2000    900, 1800
            q1=40   2000, 1500   1600, 1600    800, 1200
            q1=60   1800,  900   1200,  800      0,    0
  62. Find the Nash Equilibrium • q=60 is weakly dominated for both firms; use cell-by-cell inspection to complete the search for the equilibrium. • The Nash equilibrium is q1=40, q2=40, with profits 1600, 1600:

                              Firm 2
                    q2=30        q2=40        q2=60
     Firm 1 q1=30   1800, 1800   1500, 2000    900, 1800
            q1=40   2000, 1500   1600, 1600    800, 1200   <- Nash Eq. at (q1=40, q2=40)
            q1=60   1800,  900   1200,  800      0,    0
  63. Continuous Pure Strategies • In many instances, the pure strategies available to players do not consist of just 2 or 3 choices but instead consist of (infinitely) many possibilities. • We handle these situations by finding each player’s reaction function, a continuous function revealing the action the player will choose as a function of the action chosen by the other player. • For illustration purposes, let us consider again the two firm Cournot quantity competition game. • Duopoly (two firms only) competition leads to an outcome in between monopoly (1 firm, maximum possible profit) and perfect competition (many firms, each earning 0 profits, P=c).
  64. Profit Maximization with Continuous Strategies • Firm 1’s profit π1 = (P-c)q1 = (a-b(q1+q2)-c)q1 = (a-bq2-c)q1 - b(q1)^2. • Firm 2’s profit π2 = (P-c)q2 = (a-b(q1+q2)-c)q2 = (a-bq1-c)q2 - b(q2)^2. • Both firms seek to maximize profits. We find the profit-maximizing amount using calculus: • Firm 1: dπ1/dq1 = a-bq2-c-2bq1. At a maximum, dπ1/dq1 = 0, so q1 = (a-bq2-c)/2b. This is firm 1’s best response function. • Firm 2: dπ2/dq2 = a-bq1-c-2bq2. At a maximum, dπ2/dq2 = 0, so q2 = (a-bq1-c)/2b. This is firm 2’s best response function. • In our numerical example, firm 1’s best response function is q1 = (a-bq2-c)/2b = (130-q2-10)/2 = 60-q2/2. • Similarly, firm 2’s best response function in our example is q2 = (a-bq1-c)/2b = (130-q1-10)/2 = 60-q1/2.
  65. Equilibrium with Continuous Strategies • Equilibrium can be found algebraically or graphically. • Algebraically: q1 = 60-q2/2 and q2 = 60-q1/2, so substitute out using one of these equations: q1 = 60-(60-q1/2)/2 = 60-30+q1/4, so q1(1-1/4) = 30, and q1 = 30/.75 = 40. Similarly, you can show that q2 = 40 as well (the problem is perfectly symmetric). • Graphically: the two best response lines q1 = 60-q2/2 and q2 = 60-q1/2 (each running from 60 on one axis to 120 on the other) intersect at (40, 40). In this case, the equilibrium in the continuous strategy space is the same as in the discrete (3 choice) action space. This is not always the case.
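A third route to the same answer is numerical: iterate the two best response functions until they stop moving; the fixed point is the Cournot equilibrium. A sketch, assuming the example's parameters a=130, b=1, c=10:

```python
# Sketch: best-response iteration for the Cournot example. Each firm's
# best response is q = (a - b*q_other - c) / (2b); iterating this map
# converges to the fixed point, the Cournot equilibrium (40, 40).

def best_response(q_other, a=130, b=1, c=10):
    return (a - b * q_other - c) / (2 * b)

q1 = q2 = 0.0
for _ in range(100):
    q1, q2 = best_response(q2), best_response(q1)  # simultaneous update

print(round(q1), round(q2))  # 40 40
```

The iteration converges quickly here because the best-response map is a contraction (each step quarters the distance to the fixed point); that convergence is a property of this example's parameters, not of Cournot games in general.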
  66. Probability, Expected Payoffs and Expected Utility • In thinking about mixed strategies, we will need to make use of probabilities. We will therefore review the basic rules of probability and then derive the notion of expected value. • We will also develop the notion of expected utility as an alternative to expected payoffs. • Probabilistic analysis arises when we face uncertainty. • In situations where events are uncertain, a probability measures the likelihood that a particular event (or set of events) occurs. – e.g. The probability that a roll of a die comes up 6. – The probability that two randomly chosen cards add up to 21 (Blackjack).
  67. Sample Space or Universe • Let S denote a set (collection or listing) of all possible states of the environment, known as the sample space or universe; a typical state is denoted as s. For example: • S = {s1, s2}; success/failure, or low/high price. • S = {s1, s2, ..., sn-1, sn}; the number of units sold or offers received, up to n. • S = [0, ∞); stock price or salary offer (a continuous positive state space).
  68. Events • An event is a collection of those states s that result in the occurrence of the event. • An event can be that state s occurs or that multiple states occur, or that one of several states occurs (there are other possibilities). • Event A is a subset of S, denoted as A ⊂ S. • Event A occurs if the true state s is an element of the set A, written as s∈A.
  69. Venn Diagrams • Illustrate the sample space and events: S is the sample space, and A1 and A2 are events within S. • “Event A1 does not occur.” Denoted A1^c (the complement of A1). • “Event A1 or A2 occurs.” Denoted A1 ∪ A2 (for probability, use addition rules). • “Event A1 and A2 both occur.” Denoted A1 ∩ A2 (for probability, use multiplication rules).
  70. Probability • To each uncertain event A, or set of events, e.g. A1 or A2, we would like to assign weights which measure the likelihood or importance of the events in a proportionate manner. • Let P(Ai) be the probability of Ai. • We further assume that: ∪(all i) Ai = S, P(∪(all i) Ai) = 1, and P(Ai) ≥ 0.
  71. Addition Rules • The probability of event A or event B: P(A ∪ B). • If the events do not overlap, i.e. the events are disjoint subsets of S, so that A ∩ B = ∅, then the probability of A or B is simply the sum of the two probabilities: P(A ∪ B) = P(A) + P(B). • If the events overlap (are not disjoint), A ∩ B ≠ ∅, use the modified addition rule: P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
  72. Example Using the Addition Rule • Suppose you throw two dice. There are 6x6=36 possible ways in which both can land. • Event A: What is the probability that both dice show the same number? A = {{1,1}, {2,2}, {3,3}, {4,4}, {5,5}, {6,6}}, so P(A) = 6/36. • Event B: What is the probability that the two dice add up to eight? B = {{2,6}, {3,5}, {4,4}, {5,3}, {6,2}}, so P(B) = 5/36. • Event C: What is the probability that A or B happens, i.e. P(A ∪ B)? First, note that A ∩ B = {{4,4}}, so P(A ∩ B) = 1/36. P(A ∪ B) = P(A) + P(B) - P(A ∩ B) = 6/36 + 5/36 - 1/36 = 10/36 (5/18).
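The dice example can be verified by brute-force enumeration of all 36 outcomes; a sketch using exact fractions:

```python
# Sketch: checking the addition-rule example by enumerating every way
# two dice can land, with exact probabilities via Fraction.

from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))     # all 36 outcomes
A = {r for r in rolls if r[0] == r[1]}           # both dice show the same number
B = {r for r in rolls if r[0] + r[1] == 8}       # the two dice add up to eight

def p(event):
    return Fraction(len(event), len(rolls))

print(p(A), p(B), p(A & B), p(A | B))  # 1/6 5/36 1/36 5/18
```

The set union `A | B` realizes "A or B" and the intersection `A & B` realizes "A and B", so the printed values confirm P(A ∪ B) = 6/36 + 5/36 - 1/36 = 5/18.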
  73. Multiplication Rules • The probability of event A and event B: P(A ∩ B). • The multiplication rule applies if A and B are independent events. • A and B are independent events if P(A) does not depend on whether B occurs or not, and P(B) does not depend on whether A occurs or not: P(A ∩ B) = P(A) × P(B) = P(AB). • Conditional probability for non-independent events: the probability of A given that B has occurred is P(A|B) = P(AB)/P(B).
  74. Examples Using Multiplication Rules • An unbiased coin is flipped 5 times. What is the probability of the sequence TTTTT? P(T) = .5; with 5 independent flips, .5x.5x.5x.5x.5 = .03125. • Suppose a card is drawn from a standard 52-card deck. Let B be the event: the card is a queen; P(B) = 4/52. Event A: Conditional on event B, what is the probability that the card is the Queen of Hearts? First note that P(AB) = P(A ∩ B) = 1/52 (the probability the card is the Queen of Hearts). P(A|B) = P(AB)/P(B) = (1/52)/(4/52) = 1/4.
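The card example can likewise be checked by enumerating the deck; a sketch (the rank and suit labels are our own):

```python
# Sketch: conditional probability by counting cards in a 52-card deck.

from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))

B = [card for card in deck if card[0] == "Q"]       # event B: any queen
AB = [card for card in B if card[1] == "hearts"]    # event A and B: Queen of Hearts

p_B = Fraction(len(B), len(deck))     # 4/52
p_AB = Fraction(len(AB), len(deck))   # 1/52
print(p_AB / p_B)                     # P(A|B) = 1/4
```

Dividing P(AB) by P(B) implements the conditional probability rule from the previous slide and reproduces the answer 1/4.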
  75. Bayes Rule • Used for making inferences: given a particular outcome, event A, can we infer the unobserved cause of that outcome, some event B1, B2, ..., Bn? • Suppose we know the prior probabilities P(Bi) and the conditional probabilities P(A|Bi). • Suppose that B1, B2, ..., Bn form a complete partition of the sample space S, so that ∪i Bi = S and Bi ∩ Bj = ∅ for any i ≠ j. In this case we have that: P(A) = Σ(i=1..n) P(A|Bi)P(Bi).   (1) • Bayes rule is a formula for computing the posterior probabilities, e.g. the probability that event Bk was the cause of outcome A, denoted P(Bk|A): P(Bk|A) = P(Bk ∩ A)/P(A) = P(A|Bk)P(Bk)/P(A) [using the conditional probability rule] = P(A|Bk)P(Bk) / Σ(i=1..n) P(A|Bi)P(Bi) [using expression (1) above]. This is Bayes Rule.
  76. Bayes Rule-Special Case • Suppose S consists of just B and not B, i.e. B^c. • Then Bayes rule can be stated as: P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|B^c)P(B^c)]. • Example: Suppose a drug test is 95% effective: the test will be positive on a drug user 95% of the time, and will be negative on a non-drug user 95% of the time. Assume 5% of the population are drug users. Suppose an individual tests positive. What is the probability he is a drug user?
77. Bayes Rule Example • Let A be the event that the individual tests positive. Let B be the event the individual is a drug user, and let Bc be the complementary event, that the individual is not a drug user. Find P(B|A). • P(A|B) = .95, P(A|Bc) = .05, P(B) = .05, P(Bc) = .95.
P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|Bc)P(Bc)] = (.95)(.05) / [(.95)(.05) + (.05)(.95)] = .50
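A quick sketch of the same posterior calculation (variable names are my own):

```python
# Bayes rule, special case: P(B|A) = P(A|B)P(B) / [P(A|B)P(B) + P(A|Bc)P(Bc)]
p_pos_given_user = 0.95     # P(A|B): test positive given drug user
p_pos_given_nonuser = 0.05  # P(A|Bc): test positive given non-user
p_user = 0.05               # P(B): prior probability of being a user

numerator = p_pos_given_user * p_user
denominator = numerator + p_pos_given_nonuser * (1 - p_user)
p_user_given_pos = numerator / denominator

print(round(p_user_given_pos, 2))  # 0.5
```

Despite the 95% accuracy of the test, the low 5% prior pulls the posterior down to one half.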
78. Monty Hall’s 3 Door Problem • There are three closed doors. Behind one of the doors is a brand new sports car. Behind each of the other two doors is a smelly goat. You can’t see the car or smell the goats. • You win the prize behind the door you choose. • The sequence of play of the game is as follows: – You choose a door and announce your choice. – The host, Monty Hall, who knows where the car is, always opens one of the two doors that you did not choose, which he knows has a goat behind it. – Monty then asks if you want to switch your choice to the unopened door that you did not choose. • Should you switch?
79. You Should Always Switch • Let Ci be the event “the car is behind door i” and let G be the event “Monty opens a door with a goat behind it.” • Suppose, without loss of generality, the contestant chooses door 1. Then Monty shows a goat behind door number 3. • According to the rules, P(G) = 1, and so P(G|C1) = 1. • Initially, P(C1) = P(C2) = P(C3) = 1/3. By the addition rule, we also know that P(C2 ∪ C3) = 2/3. • After Monty’s move, P(C3) = 0; P(C1) remains 1/3, but P(C2) now becomes 2/3! • According to Bayes Rule: P(C1|G) = P(G|C1)P(C1)/P(G) = (1 × 1/3)/1 = 1/3. • It follows that P(C2|G) = 2/3, so the contestant always does better by switching; the probability is 2/3 that he wins the car.
80. Here is Another Proof • Let (w,x,y,z) describe the game: w = your initial door choice, x = the door Monty opens, y = the door you finally decide upon, and z = W/L (whether you win or lose). – Without loss of generality, assume the car is behind door number 1, and that there are goats behind door numbers 2 and 3. – Suppose you adopt the never-switch strategy. The sample space under this strategy is S = [(1,2,1,W), (1,3,1,W), (2,3,2,L), (3,2,3,L)]. If you choose door 2 or 3 you always lose with this strategy. But if you initially choose one of the three doors randomly, it must be that the outcomes (2,3,2,L) and (3,2,3,L) each occur with probability 1/3. That means the two outcomes (1,2,1,W) and (1,3,1,W) share the remaining 1/3 probability ⇒ you win with probability 1/3. – Suppose you adopt the always-switch strategy. The sample space under this strategy is S = [(1,2,3,L), (1,3,2,L), (2,3,1,W), (3,2,1,W)]. Since you initially choose door 2 with probability 1/3 and door 3 with probability 1/3, the probability you win with the switching strategy is 1/3 + 1/3 = 2/3 ⇒ you should always switch.
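Both arguments can also be checked by Monte Carlo simulation (a sketch; the door indexing and seed are arbitrary):

```python
import random

def play(switch, rng):
    """Play one round of the Monty Hall game; return True if the car is won."""
    car = rng.randrange(3)
    choice = rng.randrange(3)
    # Monty opens a door the contestant did not pick that hides a goat.
    opened = next(d for d in range(3) if d != choice and d != car)
    if switch:
        # Switch to the one remaining closed door.
        choice = next(d for d in range(3) if d != choice and d != opened)
    return choice == car

rng = random.Random(0)  # fixed seed for reproducibility
trials = 100_000
switch_wins = sum(play(True, rng) for _ in range(trials)) / trials
stay_wins = sum(play(False, rng) for _ in range(trials)) / trials
print(switch_wins, stay_wins)  # close to 2/3 and 1/3
```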
81. Expected Value (or Payoff) • One use of probabilities is to calculate expected values (or payoffs) for uncertain outcomes. • Suppose that an outcome, e.g. a money payoff, is uncertain. There are N possible values, X1, X2, …, XN. Moreover, we know the probability of obtaining each value. • The expected value (or expected payoff) of the uncertain outcome is then given by: P(X1)X1 + P(X2)X2 + … + P(XN)XN.
82. An Example • You are made the following proposal: You pay $3 for the right to roll a die once. You then roll the die and are paid the number of dollars shown on the die. Should you accept the proposal? • The expected payoff of the uncertain die throw is: (1/6)×$1 + (1/6)×$2 + (1/6)×$3 + (1/6)×$4 + (1/6)×$5 + (1/6)×$6 = $3.50. • The expected payoff from the die throw is greater than the $3 price, so a (risk neutral) player accepts the proposal.
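The expected payoff can be computed directly from the formula on the previous slide (a minimal sketch):

```python
from fractions import Fraction

# Expected payoff of one die roll: sum of P(X)·X over the six faces.
faces = [1, 2, 3, 4, 5, 6]
expected_payoff = sum(Fraction(1, 6) * x for x in faces)

price = 3
print(float(expected_payoff), expected_payoff > price)  # 3.5 True
```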
83. Extensive Form Illustration: Nature as a Player • [Figure: game tree in which Nature rolls the die; payoffs are in net terms, winnings net of the $3 price.]
84. Accounting for Risk Aversion • The assumption that individuals treat expected payoffs the same as certain payoffs (i.e. that they are risk neutral) may not hold in practice. • Recall our earlier examples: – A risk neutral person is indifferent between $25 for certain and a gamble with a 25% chance of earning $100 and a 75% chance of earning $0. – A risk neutral person agrees to pay $3 to roll a die once and receive as payment the number of dollars shown on the die. • Many people are risk averse and prefer $25 with certainty to the uncertain gamble, or might be unwilling to pay $3 for the right to roll the die once, so imagining that people base their decisions on expected payoffs alone may yield misleading results. • What can we do to account for the fact that many people are risk averse? We can use the concept of expected utility.
85. Utility Function Transformation • Let x be the payoff amount in dollars, and let U(x) be a continuous, increasing function of x. • The function U(x) gives an individual’s level of satisfaction in fictional “utils” from receiving payoff amount x, and is known as a utility function. • If the certain payoff of $25 is preferred to the gamble (due to risk aversion), then we want a utility function that satisfies: U($25) > .25 U($100) + .75 U($0). • The left-hand side is the utility of the certain payoff and the right-hand side is the expected utility from the gamble. • In this case, any concave function U(x) will work, e.g. U(x) = √x: √25 > .25√100 + .75√0 ⇔ 5 > 2.5.
86. Graphical Illustration • [Figure: the blue curve shows the utility of each certain monetary payoff between $0 and $100, assuming U(x) = √x; the black (dashed) line shows the expected utility of the risky payoff.] • Utility increases with monetary payoff at a diminishing rate – this is just the principle of diminishing marginal utility, which concavity (risk aversion) requires. • At $25, the certain payoff yields higher utility than the risky payoff.
87. Another Example • If keeping $3 were preferred to rolling a die and getting paid the number of dollars that turns up (expected payoff $3.50), we need a utility function that satisfies: U($3) > (1/6)U($1) + (1/6)U($2) + (1/6)U($3) + (1/6)U($4) + (1/6)U($5) + (1/6)U($6). • In this case, where the expected payoff of $3.50 is strictly higher than the certain amount – the $3 price – the utility function must be sufficiently concave for the above relation to hold. – If we used U(x) = √x = x^(1/2), we would find that the left-hand side of the expression above is √3 = 1.732, while the right-hand side equals 1.805, so we need a more concave function. • We would need a utility function transformation as concave as U(x) = x^(1/100) for the inequality above to hold (50 times more risk aversion)!
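Both comparisons can be verified numerically; this sketch follows the slide’s U(x) = x^a examples:

```python
# Check how concave U(x) = x**a must be for U($3) to exceed the expected
# utility of the die gamble, for the two exponents used on the slide.
def certain_vs_gamble(a):
    u_certain = 3 ** a                             # utility of keeping $3
    eu_gamble = sum(x ** a for x in range(1, 7)) / 6  # expected utility of roll
    return u_certain, eu_gamble

u_half, eu_half = certain_vs_gamble(0.5)   # square root: not concave enough
u_tiny, eu_tiny = certain_vs_gamble(0.01)  # x**(1/100): barely concave enough
print(u_half < eu_half, u_tiny > eu_tiny)  # True True
```

The x^(1/100) case succeeds only by a tiny margin, which is the slide’s point: rejecting this gamble takes an extreme degree of risk aversion.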
88. Summing up • The notions of probability and expected payoff are frequently encountered in game theory. • We mainly assume that players are risk neutral, so that they seek to maximize expected payoff. • We are aware that expected monetary payoff might not be the relevant consideration – aversion to risk may play a role. • We have seen how to transform the objective from payoff to utility maximization so as to capture the possibility of risk aversion – the trick is to assume some concave utility function transformation. • Now that we know how to deal with risk aversion, we are going to largely ignore it and assume risk neutral behavior ☺
89. Mixed Strategy Nash Equilibrium • A mixed strategy is one in which a player plays his available pure strategies with certain probabilities. • Mixed strategies are best understood in the context of repeated games, where each player’s aim is to keep the other player(s) guessing, for example: Rock, Paper, Scissors. • If each player in an n-player game has a finite number of pure strategies, then there exists at least one equilibrium in (possibly) mixed strategies. (Nash proved this.) • In a 2x2 game with no pure strategy equilibria, there is a unique mixed strategy equilibrium. • However, it is possible for pure strategy and mixed strategy Nash equilibria to coexist, as in the Chicken game.
90. Example 1: Tennis • Let p be the probability that Serena chooses DL, so 1−p is the probability that she chooses CC. • Let q be the probability that Venus positions herself for DL, so 1−q is the probability that she positions herself for CC. • To find mixed strategies, we add the p-mix and q-mix options. Payoffs are (Serena, Venus):

                 Venus: DL                  Venus: CC                  Venus: q-mix
Serena: DL       50, 50                     80, 20                     50q+80(1−q), 50q+20(1−q)
Serena: CC       90, 10                     20, 80                     90q+20(1−q), 10q+80(1−q)
Serena: p-mix    50p+90(1−p), 50p+10(1−p)   80p+20(1−p), 20p+80(1−p)
91. Row Player’s Optimal Choice of p • Choose p so as to equalize the payoff your opponent receives from playing either pure strategy. • This requires understanding how your opponent’s payoff varies with your choice of p. In our example: for Serena’s choice of p, Venus’s expected payoff from playing DL is 50p+10(1−p), and from playing CC is 20p+80(1−p). • [Figure: Venus’s success rate from positioning for DL or CC, plotted against Serena’s p-mix; the two lines cross where Venus is made indifferent, at p = .70.]
92. Algebraically: • Serena solves for the value of p that equates Venus’s payoff from positioning herself for DL or CC: 50p+10(1−p) = 20p+80(1−p), or 50p+10−10p = 20p+80−80p, or 40p+10 = 80−60p, or 100p = 70, so p = 70/100 = .70. • If Serena plays DL with probability p = .70 and CC with probability 1−p = .30, then Venus’s success rate from DL = 50(.70)+10(.30) = 38% = Venus’s success rate from CC = 20(.70)+80(.30) = 38%. • Since this is a constant sum game, Serena’s success rate is 100% − Venus’s success rate = 100 − 38 = 62%.
93. Column Player’s Optimal Choice of q • Choose q so as to equalize the payoff your opponent receives from playing either pure strategy. • This requires understanding how your opponent’s payoff varies with your choice of q. In our example: for Venus’s choice of q, Serena’s expected payoff from playing DL is 50q+80(1−q), and from playing CC is 90q+20(1−q). • [Figure: Serena’s success rate from playing DL or CC, plotted against Venus’s q-mix; the two lines cross where Serena is made indifferent, at q = .60.]
94. Algebraically • Venus solves for the value of q that equates Serena’s payoff from playing DL or CC: 50q+80(1−q) = 90q+20(1−q), or 50q+80−80q = 90q+20−20q, or 80−30q = 70q+20, or 60 = 100q, so q = 60/100 = .60. • If Venus positions herself for DL with probability q = .60 and CC with probability 1−q = .40, then Serena’s success rate from DL = 50(.60)+80(.40) = 62% = Serena’s success rate from CC = 90(.60)+20(.40) = 62%. • Since this is a constant sum game, Venus’s success rate is 100% − Serena’s success rate = 100 − 62 = 38%.
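Both indifference calculations can be done with one small helper (a sketch; the function name and payoff encoding are my own):

```python
from fractions import Fraction

def indifference_mix(x1, x2, y1, y2):
    """Mixing probability m on my first strategy that makes the opponent
    indifferent between responses X and Y, where the opponent earns
    x1*m + x2*(1-m) from X and y1*m + y2*(1-m) from Y."""
    return Fraction(y2 - x2, (x1 - y1) + (y2 - x2))

# Serena's p: Venus earns 50p+10(1-p) from DL and 20p+80(1-p) from CC.
p = indifference_mix(50, 10, 20, 80)
# Venus's q: Serena earns 50q+80(1-q) from DL and 90q+20(1-q) from CC.
q = indifference_mix(50, 80, 90, 20)

# Equilibrium success rates follow from either pure strategy.
venus_rate = 50 * p + 10 * (1 - p)    # = 38
serena_rate = 50 * q + 80 * (1 - q)   # = 62
print(p, q, serena_rate, venus_rate)  # 7/10 3/5 62 38
```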
95. The Mixed Strategy Equilibrium • A strictly mixed strategy Nash equilibrium in a 2 player, 2 choice (2x2) game is a pair of probabilities 0 < p < 1 and 0 < q < 1 such that p is a best response by the row player to the column player’s choice of q, and q is a best response by the column player to the row player’s choice of p. • In our example, p = .70, q = .60. The row player’s (Serena’s) payoff was 62 and the column player’s (Venus’s) payoff was 38. (Serena wins 62%, Venus 38%.) • Pure strategies can now be understood as special cases of mixed strategies, where p is chosen from the set {0, 1} and q is chosen from the set {0, 1}. For example, if p = 0 and q = 1, then the row player always plays CC and the column player always plays DL.
96. Keeping the Opponent Indifferent • Why is this a useful objective for determining the mixing probability? – In constant sum games, such as the tennis example, making your opponent indifferent in expected payoff terms is equivalent to minimizing your opponent’s ability to recognize and exploit systematic patterns of behavior in your own choice. – In constant sum games, keeping your opponent indifferent is equivalent to keeping yourself indifferent. – The same objective works for finding mixed strategy equilibria in non-constant sum games as well, where players’ interests are not totally opposed to one another. • Mixing necessarily suggests that the game is played repeatedly.
97. Best Response Functions • Another way to depict each player’s choice of the mixing probability. (Recall p = P(DL) by Serena, q = P(DL) by Venus.) • The best response functions show q = f(p) for Venus and p = g(q) for Serena. • p, q = 0 means always play CC; p, q = 1 means always play DL. • [Figure: Venus’s best response function jumps from q = 0 to q = 1 at p = .70; Serena’s best response function jumps from p = 1 to p = 0 at q = .60.]
98. Construction of Best Response Functions • Use the graphs of the optimal choices of q = f(p) and p = g(q). • [Figure: Venus’s success rates from DL or CC plotted against Serena’s p-mix, and Serena’s success rates from DL or CC plotted against Venus’s q-mix, together with the resulting best response functions; Venus’s best response switches at p = .70 and Serena’s at q = .60.]
99. Combining the Best Response Functions Reveals the Mixed Strategy Nash Equilibrium • [Figure: the two best response functions plotted together in (p, q) space intersect only at p = .70, q = .60 – the mixed strategy Nash equilibrium.] • Equilibria, pure or mixed, obtain wherever the best response functions intersect. • Note that there is no equilibrium in pure strategies in this game, i.e. no intersection at p = q = 0 or p = q = 1.
  100. Example 2: Market Entry Game • Two Firms, MD and BK must decide whether to put one of their restaurants in a shopping mall. • The strategies are to “Enter” or “Don’t Enter”. • If either firm plays Don’t Enter, it earns 0 profits. • If one firm plays Enter and the other plays Don’t Enter, the Firm that plays Enter earns $300,000 in profits (Don’t enter always yields 0 profits). • If both firms choose to play Enter, both lose $100,000 since there is not enough demand for two restaurants to make positive profits.
101. The Market Entry Game is Non-Zero Sum • Payoffs are in $100,000 amounts, listed as (MD, BK):

                   BK: Enter   BK: Don’t Enter
MD: Enter           −1, −1         3, 0
MD: Don’t Enter      0, 3          0, 0

• What are the Nash equilibria in this game?
102. Cell-by-Cell Inspection Reveals 2 Pure Strategy Equilibria • Payoffs are in $100,000 amounts; the pure strategy equilibria are (Enter, Don’t Enter) and (Don’t Enter, Enter). • BUT are there any mixed strategy equilibria? • To find out, we look for mixing probabilities that make each firm indifferent between its two pure strategies.
103. Calculation of the Mixed Strategy Probabilities • MD (BK) wants to choose p (q) to make BK (MD) indifferent between playing Enter or Don’t Enter. • MD: choose p such that BK’s payoff from Enter, p(−1)+(1−p)(3), equals BK’s payoff from Don’t Enter, p(0)+(1−p)(0) = 0. So −p+3−3p = 0, or 3 = 4p, so p = 3/4. • BK: choose q such that MD’s payoff from Enter, q(−1)+(1−q)(3), equals MD’s payoff from Don’t Enter, q(0)+(1−q)(0) = 0. So −q+3−3q = 0, or 3 = 4q, so q = 3/4. • In the mixed strategy Nash equilibrium both firms choose Enter with probability 3/4, and Don’t Enter with probability 1/4.
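A minimal sketch of the indifference calculation (the helper name is my own):

```python
from fractions import Fraction

def entry_mix(loss_if_both, gain_if_alone):
    """Opponent's entry probability that makes a firm indifferent between
    Enter (loss if both enter, gain if alone) and Don't Enter (payoff 0).
    Solves p*loss + (1-p)*gain = 0 for p."""
    return Fraction(gain_if_alone, gain_if_alone - loss_if_both)

p = entry_mix(-1, 3)  # MD's entry probability (makes BK indifferent)
q = entry_mix(-1, 3)  # BK's entry probability (symmetric game)
print(p, q)  # 3/4 3/4
```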
104. Best Response Functions for the Market Entry Game • Each firm’s expected payoff from Enter is 3−4p (for BK, as a function of MD’s p) or 3−4q (for MD, as a function of BK’s q), while the payoff from Don’t Enter is always 0. • [Figure: the combined best response functions show both pure strategy equilibria (PSE) and the mixed strategy equilibrium (MSE) at p = q = 3/4.]
105. Economics of the Mixed Strategy Equilibrium • Both firms choose to play Enter with probability 3/4. • Expected payoff from Enter = 3−4(3/4) = 0, which is the same as the expected payoff from Don’t Enter (always 0). • Probability both Enter (multiplication rule): (3/4)(3/4) = 9/16, about half the time. • Probability MD enters and BK does not: (3/4)(1/4) = 3/16. • Probability MD does not enter and BK does: (1/4)(3/4) = 3/16. • Probability that neither firm Enters: (1/4)(1/4) = 1/16. • Check: 9/16+3/16+3/16+1/16 = 1.0. • Payoff calculation for MD (same for BK): (9/16)(−1)+(3/16)(3)+(3/16)(0)+(1/16)(0) = 0.
106. Asymmetric Mixed Strategy Equilibrium • Suppose we change the payoff matrix so that MD has a competitive advantage over BK in one situation: if MD is the sole entrant, it earns profits of $400,000. • Otherwise the game is the same as before:

                   BK: ENTER   BK: DON’T ENTER
MD: ENTER           −1, −1         4, 0
MD: DON’T ENTER      0, 3          0, 0

• The pure strategy equilibria remain the same: (E, D) or (D, E). What happens to the mixed strategy probability?
107. Changes in the Mixed Strategy Probability? • Since BK’s payoffs have not changed, MD’s mixture probability p does not change. MD chooses p so that −1(p)+3(1−p) = 0, or 3 = 4p, p = 3/4. • However, since MD’s payoffs have changed, BK’s mixture probability q must change. • BK chooses q so as to make MD just indifferent between Entering, earning (−1)q+4(1−q), and not entering, earning 0: −q+4−4q = 0, or 4 = 5q, so q = 4/5. • Note that the probability that BK enters goes up from 3/4 to 4/5. If it did not, then MD would choose p = 1. Why? Because at q = 3/4, MD’s expected payoff from Enter would be −1(3/4)+4(1/4) = 1/4 > 0, so Enter would be MD’s strict best response. • MD’s expected payoff remains zero: −1(4/5)+4(1/5) = 0.
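The same helper idea, re-run with MD’s new sole-entrant payoff (a sketch; names are my own):

```python
from fractions import Fraction

def entry_mix(loss_if_both, gain_if_alone):
    """Opponent's entry probability that makes a firm indifferent between
    Enter (loss if both enter, gain if alone) and Don't Enter (payoff 0)."""
    return Fraction(gain_if_alone, gain_if_alone - loss_if_both)

# MD's p still solves BK's indifference (BK's payoffs are unchanged).
p = entry_mix(-1, 3)
# BK's q now solves MD's indifference with MD's new sole-entrant payoff of 4.
q = entry_mix(-1, 4)
md_entry_payoff = q * (-1) + (1 - q) * 4  # MD's expected payoff from Enter
print(p, q, md_entry_payoff)  # 3/4 4/5 0
```

Note the general lesson: improving MD’s payoff changes the *opponent’s* mixing probability, not MD’s own.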
108. The Market Entry Game with More Players • Market entry games frequently involve more than 2 firms (players). • Let there be N > 2 firms. • Let m be the number of the N firms choosing to Enter (so N−m choose Don’t Enter). • Suppose the payoff to each firm is given by:
Payoff earned by a firm = 10 if it chooses X (Don’t Enter); 10 + 2(7 − m) if it chooses Y (Enter).
109. Equilibria in the N-player Market Entry Game • Pure strategy equilibrium: each firm plays either Enter or Don’t Enter. • Firms compare the payoff from playing Don’t Enter with the payoff from playing Enter. • When does Don’t Enter yield the same payoff as Enter? 10 = 10+2(7−m), only when m = 7. So the pure strategy equilibrium is for 7 of the N firms to play Enter and the other N−7 firms to play Don’t Enter.
110. Mixed Strategy Equilibria in the N-Player Market Entry Game • There is also a symmetric mixed strategy equilibrium in this game, where every firm mixes between Enter and Don’t Enter with the same probability. • Let p be the probability that each firm plays Enter. • In the mixed strategy equilibrium: p = (7−1)/(N−1) = 6/(N−1). • All players earn an expected payoff of 10.
111. Calculation of the Symmetric Mixed Strategy Probability • The Enter payoff is 10+2(7−m), and the Don’t Enter payoff is 10. • First, for simplicity, subtract 10 from both payoffs: indifference then requires the expected value of 2(7−m) to equal 0, the (normalized) Don’t Enter payoff. • Let p be each firm’s probability of entry, and assume all firms use the same p (the mixed strategy is symmetric). • Recall m is the total number of entrants including firm i. Given that firm i enters, the expected value of m is p(N−1)+1: the expected number of entrants among the N−1 other firms, plus 1 for firm i itself. • So 2(7−[p(N−1)+1]) = 0, or 7−1 = p(N−1), or p = (7−1)/(N−1) = 6/(N−1).
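Because the Enter payoff is linear in m, the expectation passes straight through the payoff function, so the indifference condition can be checked directly (a sketch for a few values of N):

```python
from fractions import Fraction

def symmetric_entry_mix(n):
    """Symmetric mixing probability p = 6/(N-1) in the N-player entry game."""
    return Fraction(6, n - 1)

for n in (8, 10, 25):
    p = symmetric_entry_mix(n)
    expected_entrants = p * (n - 1) + 1            # E[m], given that firm i enters
    entry_payoff = 10 + 2 * (7 - expected_entrants)  # payoff is linear in m
    # At p = 6/(N-1) the expected entry payoff equals the Don't Enter payoff.
    assert expected_entrants == 7 and entry_payoff == 10
print("expected Enter payoff = Don't Enter payoff = 10 at p = 6/(N-1)")
```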
112. Sequential and Simultaneous Move Games • So far we have studied two types of games: 1) sequential move (extensive form) games, where players take turns choosing actions, and 2) strategic form (normal form) games, where players simultaneously choose their actions. • Can a sequential move game be represented in strategic form, and can a strategic form game be represented in extensive form? The answer is yes, but the rules of the game must change. • A second issue that we will examine is games that combine both sequential and simultaneous moves, as for example happens in the game of football.
113. From Sequential to Simultaneous Moves • It is easy to translate the information contained in an extensive form game to that contained in the strategic form game matrix. • Illustration 1: The Senate Race Game. The extensive form game information in payoff table form, where Green’s strategies are contingent plans (payoffs are (Gray, Green)):

                Green: In,In   In,Out   Out,In   Out,Out
Gray: Ads           1,1         1,1       3,3      3,3
Gray: No Ads        2,4         4,2       2,4      4,2
114. Information Sets • To actually convert a sequential move game into a simultaneous move game requires changing the rules so that no player can observe actions chosen by other players prior to their own choice of action. • We do this conversion by adding a device called an information set to the extensive-form game tree. • An information set is a set of game tree nodes at which the same player makes a choice, but this player does not know which node in the information set is the one to which play of the game has evolved. • It is represented by a dashed line connecting the nodes, or an oval around the nodes, in the information set.
115. Conversion of the Senate Race Game to Strategic Form • The extensive form game with this information set is equivalent to the simultaneous move game shown below in strategic form. • Is the equilibrium in the simultaneous move game different than in the sequential move game?
116. The Equilibrium Outcomes are Different! • Why? Because in the extensive form game Gray has a first-mover advantage: he can precommit to running Ads. • In the simultaneous move game, Gray cannot precommit to running Ads.
117. Example 2: The Investment Game • [Figure: the sequential move version; its conversion to a simultaneous move version by adding an information set; and the resulting simultaneous move game in strategic form.] • Is the equilibrium in the simultaneous move game different than in the sequential move game?
118. The Equilibrium Outcomes are the Same! • Whether the equilibrium outcomes are the same or different in a sequential move game versus a simultaneous move game will depend on the credibility of the first mover’s action choice in the sequential move game. • If it is not credible, then the equilibrium outcome of the game may well be different when cast as a simultaneous move game.
119. Combining Sequential and Simultaneous Moves • Consider the following 2 player game, where Player 1 moves first. If Player 1 chooses Stay Out, both he and Player 2 earn a payoff of 3 each; if Player 1 chooses Enter, he plays the following simultaneous move game with Player 2 (payoffs are (Player 1, Player 2)):

               Player 2: A   Player 2: B
Player 1: A       2,2           3,0
Player 1: B       0,3           4,4
120. Forward Induction • The simultaneous move game has 3 equilibria: (A,A), (B,B), and a mixed strategy equilibrium where both players play A with probability 1/3 and earn expected payoff 8/3. • If Player 2 sees that Player 1 has chosen to Enter, Player 2 can use forward induction reasoning: since Player 1 chose to forego a payoff of 3, it is likely that he will choose B, so I should also choose B. • The likely equilibrium of the game is therefore: Enter, (B,B).
121. The Incumbent-Rival Game in Extensive and Strategic Form • How many equilibria are there in the extensive form of this game? • How many equilibria are there in the strategic form of this game?
122. The Number of Equilibria Appears to be Different! • There appears to be just 1 equilibrium using rollback on the extensive form game. • There appear to be 2 equilibria using cell-by-cell inspection of the strategic form game.
123. Subgame Perfection • In the strategic form game, there is an additional equilibrium, (Stay Out, Fight), that is not an equilibrium using rollback in the extensive form game. • Equilibria found by applying rollback to the extensive form game are referred to as subgame perfect equilibria: every player makes a perfect best response at every subgame of the tree. – (Enter, Accommodate) is a subgame perfect equilibrium. – (Stay Out, Fight) is not a subgame perfect equilibrium. • A subgame is the game that begins at any node of the decision tree. The 3 subgames (circled in the figure) are the games beginning at each of the tree’s nodes, including the root node (the game itself).
124. Imperfect Strategies are Incredible • Strategies and equilibria that fail the test of subgame perfection are called imperfect. • The imperfection of a strategy that is part of an imperfect equilibrium is that at some point in the game it has an unavoidable credibility problem. • Consider, for example, the equilibrium where the incumbent promises to fight, so the rival chooses Stay Out. • The incumbent’s promise is incredible: the rival knows that if he enters, the incumbent is sure to accommodate, since if the incumbent adheres to his promise to fight, both earn zero, while if the incumbent accommodates, both earn a payoff of 2. • Thus, (Stay Out, Fight) is a Nash equilibrium, but it is not a subgame perfect Nash equilibrium.
125. Another Example: Mutually Assured Destruction (MAD) • What is the rollback (subgame perfect) equilibrium to this game? • [Figure: the game tree, with two subgames circled.] • Note that subgames must contain all nodes in an information set.
  126. The Strategic Form Game Admits 3 Nash Equilibria. • Which Equilibrium is Subgame Perfect? • Only the equilibrium where the strategies Escalate, Back Down are played by both the U.S. and Russia is subgame perfect. – Why?
  127. From Simultaneous to Sequential Moves • Conversion from simultaneous to sequential moves involves determining who moves first, which is not an issue in the simultaneous move game. • In some games, where both players have dominant strategies, it does not matter who moves first. – For example the prisoner’s dilemma game. • When neither player has a dominant strategy, the subgame perfect equilibrium will depend on the order in which players move. – For example, the Senate Race Game, the Pittsburgh Left-Turn Game.
128. The Equilibrium in the Prisoner’s Dilemma is the Same Regardless of Who Moves First • The simultaneous move game is equivalent to either of the 2 sequential move games below it (one for each possible first mover).
129. The Senate Race Game has a Different Subgame Perfect Equilibrium Depending on Who Moves First • [Figure: the two game trees – one for each move order – with the subgame perfect equilibrium marked in each.]
130. Similarly, in the Pittsburgh Left-Turn Game • [Figure: two game trees, one where Driver 1 moves first and one where Driver 2 moves first. At each stage a driver chooses Proceed or Yield; the payoffs are (−1490, −1490) if both Proceed, (5, −5) if Driver 1 proceeds and Driver 2 yields, (−5, 5) if Driver 1 yields and Driver 2 proceeds, and (−10, −10) if both Yield.] • These subgame perfect equilibria look the same, but if Driver 1 moves first he gets a payoff of 5, while if Driver 2 moves first Driver 1 gets a payoff of −5, and vice versa for Driver 2.
131. Going from a Simultaneous Move to a Sequential Move Game may Eliminate the Play of a Mixed Strategy Equilibrium • This is true in games with a unique mixed strategy Nash equilibrium. – Example: the Tennis Game; payoffs are (Serena, Venus):

               Venus: DL   Venus: CC
Serena: DL      50, 50      80, 20
Serena: CC      90, 10      20, 80
  132. The Pure Strategy Equilibrium is Different, Depending on Who Moves First. There is no possibility of mixing in a sequential move game without any information sets.
133. Repeated Games • With the exception of our discussion of bargaining, we have not yet examined the effect of repetition on strategic behavior in games. • If a game is played repeatedly, players may behave differently than if the game is a one-shot game (e.g. borrowing a friend’s car versus renting a car). • Two types of repeated games: – Finitely repeated: the game is played for a finite and known number of rounds, for example, 2 rounds. – Infinitely or indefinitely repeated: the game has no predetermined length; players act as though it will be played indefinitely, or it ends only with some probability.
134. Finitely Repeated Games • Writing down the strategy space for repeated games is difficult, even if the game is repeated just 2 rounds. For example, consider the strategies in the following 2x2 game (row player chooses U or D, column player chooses L or R) played just twice. • For the row player: – Round 1: choose U1 or D1 – two possible moves (the subscript denotes the round). – Round 2: for each possible first round history, pick whether to go U2 or D2. The histories are (U1,L1), (U1,R1), (D1,L1), (D1,R1), so there are 2 × 2 × 2 × 2 = 16 possible round-2 contingent plans!
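The 2 × 2 × 2 × 2 count of round-2 contingent plans can be verified by enumeration (a sketch):

```python
from itertools import product

# Round-1 histories of the 2x2 stage game: (row's move, column's move).
histories = list(product(["U1", "D1"], ["L1", "R1"]))
assert len(histories) == 4

# A round-2 contingent plan assigns U2 or D2 to each of the 4 histories.
plans = list(product(["U2", "D2"], repeat=len(histories)))
print(len(plans))  # 16
```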
135. Strategic Form of a 2-Round Finitely Repeated Game • This quickly gets messy! • [Figure: the game tree of the twice-repeated game, with round-1 moves U1/D1 and L1/R1 followed by round-2 moves U2/D2 and L2/R2 after every history.]
136. Finite Repetition of a Game with a Unique Equilibrium • Fortunately, we may be able to determine how to play a finitely repeated game by looking at the equilibrium or equilibria in the one-shot or “stage game” version of the game. • For example, consider a 2x2 game with a unique equilibrium, e.g. the Prisoner’s Dilemma. • Does the equilibrium change if this game is played just 2 rounds?
  137. A Game with a Unique Equilibrium Played Finitely Many Times Always Has the Same Subgame Perfect Equilibrium Outcome • To see this, apply backward induction to the finitely repeated game to obtain the subgame perfect Nash equilibrium (spne). • In the last round, round 2, both players know that the game will not continue further. They will therefore both play their dominant strategy of Confess. • Knowing the results of round 2 are Confess, Confess, there are no benefits to playing Don’t Confess in round 1. Hence, both players play Confess in round 1 as well. • As long as there is a known, finite end, there will be no change in the equilibrium outcome of a game with a unique equilibrium. Also true for zero or constant sum games.
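The unraveling argument rests on the stage game having a unique Nash equilibrium, which cell-by-cell inspection can confirm. A sketch, with illustrative prison-year payoffs of my own (this section does not give the numbers; payoffs are negated years, so higher is better):

```python
from itertools import product

actions = ["Confess", "Don't Confess"]
payoff = {
    ("Confess", "Confess"): (-5, -5),
    ("Confess", "Don't Confess"): (0, -10),
    ("Don't Confess", "Confess"): (-10, 0),
    ("Don't Confess", "Don't Confess"): (-1, -1),
}

def pure_nash(payoff, actions):
    """Cell-by-cell inspection: a profile is a Nash equilibrium if neither
    player gains by a unilateral deviation."""
    eqs = []
    for a1, a2 in product(actions, actions):
        row_ok = all(payoff[(a1, a2)][0] >= payoff[(d, a2)][0] for d in actions)
        col_ok = all(payoff[(a1, a2)][1] >= payoff[(a1, d)][1] for d in actions)
        if row_ok and col_ok:
            eqs.append((a1, a2))
    return eqs

# The stage game's unique equilibrium is (Confess, Confess); backward
# induction then forces it in the last round, and hence in every round.
print(pure_nash(payoff, actions))  # [('Confess', 'Confess')]
```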
138. Finite Repetition of a Stage Game with Multiple Equilibria • Consider 2 firms playing the following one-stage game N > 1 times, where N is known. What are the possible subgame perfect equilibria? • In the one-shot “stage game” there are 3 equilibria: Ab, Ba, and a mixed strategy equilibrium where both firms play A (a) with probability 1/2, in which the expected payoff to each firm is 5/2, or 2.5.
139. Games with Multiple Equilibria Played Finitely Many Times Have Many Subgame Perfect Equilibria • Some subgame perfect equilibria of the finitely repeated version of the stage game are: 1. Ba, Ba, …, Ba played N times, N an even number. 2. Ab, Ab, …, Ab played N times, N an even number. 3. Ab, Ba, Ab, Ba, … played N times, N an even number. 4. Aa, Ab, Ba for N = 3 rounds.
140. Strategies Supporting these Subgame Perfect Equilibria
1. Ba, Ba, …: Row firm’s first move: play B; thereafter, after every possible history play B. Column firm’s first move: play a; thereafter, after every possible history play a. Avg. payoffs: (4, 1).
2. Ab, Ab, …: Row firm’s first move: play A; thereafter, after every possible history play A. Column firm’s first move: play b; thereafter, after every possible history play b. Avg. payoffs: (1, 4).
3. Ab, Ba, Ab, Ba, …: Row firm’s first round move: play A; in even rounds, after every possible history play B; in odd rounds, after every possible history play A. Column firm’s first round move: play b; in even rounds, after every possible history play a; in odd rounds, after every possible history play b. Avg. payoffs: (5/2, 5/2).
141. What About that 3-Round S.P. Equilibrium? • 4. Aa, Ab, Ba (3 rounds only) can be supported by the strategies:
Row firm’s first move: play A. Second move: – if the history is (A,a) or (B,b), play A, and play B in round 3 unconditionally; – if the history is (A,b), play B, and play B in round 3 unconditionally; – if the history is (B,a), play A, and play A in round 3 unconditionally.
Column firm’s first move: play a. Second move: – if the history is (A,a) or (B,b), play b, and play a in round 3 unconditionally; – if the history is (A,b), play a, and play a in round 3 unconditionally; – if the history is (B,a), play b, and play b in round 3 unconditionally.
• Avg. payoff to Row = (3+1+4)/3 = 8/3 ≈ 2.67; avg. payoff to Column = (3+4+1)/3 = 8/3 ≈ 2.67. • More generally, if N = 101 then Aa played 99 times, followed by Ab, Ba, is also a subgame perfect equilibrium.
142. Why is this a Subgame Perfect Equilibrium? • Because Aa, Ab, Ba is each player’s best response to the other player’s strategy at each subgame. • Consider the column player. Suppose he plays b in round 1, while Row sticks to the plan of A, so the round 1 history is (A,b). – According to Row’s strategy, given a history of (A,b), Row will play B in round 2 and B in round 3. – According to Column’s strategy, given a history of (A,b), Column will play a in round 2 and a in round 3. • The column player’s average payoff is then (4+1+1)/3 = 2. This is less than the payoff of 2.67 he earns in the subgame perfect equilibrium. Hence, the column player will not play b in the first round, given his strategy and the row player’s strategy. • A similar argument applies to the row firm.
143. Summary • A repeated game is a special kind of game (in extensive or strategic form) in which the same one-shot “stage” game is played over and over again. • A finitely repeated game is one in which the game is played a fixed and known number of times. • If the stage game has a unique Nash equilibrium, this equilibrium is the unique subgame perfect equilibrium of the finitely repeated game. • If the stage game has multiple equilibria, then there are many subgame perfect equilibria of the finitely repeated game. Some of these involve the play of strategies that are collectively more profitable for players than the one-shot stage game Nash equilibria (e.g. Aa, Ab, Ba in the last game studied).
  144. Infinitely Repeated Games • Finitely repeated games are interesting, but relatively rare; how often do we really know for certain when a game we are playing will end? (Sometimes, but not often). • Some of the predictions for finitely repeated games do not hold up well in experimental tests: – The unique subgame perfect equilibria of the finitely repeated ultimatum game and prisoner’s dilemma game (always confess) are not usually observed in all rounds of finitely repeated games. • On the other hand, we routinely play many games that are indefinitely repeated (no known end). We call such games infinitely repeated games, and we now consider how to find subgame perfect equilibria in these games.
  145. Discounting in Infinitely Repeated Games • Recall from our earlier analysis of bargaining that players may discount payoffs received in the future using a constant discount factor, d = 1/(1+r), where 0 < d < 1. – For example, if d=.80, then a player values $1 received one period in the future as being equivalent to $0.80 right now (d × $1). Why? Because the implicit one-period interest rate is r=.25, so $0.80 received right now and invested at the one-period rate r=.25 gives (1+.25) × $0.80 = $1 in the next period. • Now consider an infinitely repeated game. Suppose that an outcome of this game is that a player receives $p in every future play (round) of the game. • The value of this stream of payoffs right now is: $p(d + d² + d³ + …) • The exponential terms are due to compounding of interest.
  146. Discounting in Infinitely Repeated Games, Cont. • The infinite sum d + d² + d³ + … converges to d/(1−d). • Simple proof: Let x = d + d² + d³ + … Notice that x = d + d(d + d² + d³ + …) = d + dx. Solve x = d + dx for x: x(1−d) = d, so x = d/(1−d). • Hence, the present discounted value of receiving $p in every future round is $p[d/(1−d)], or $pd/(1−d). • Note further that using the definition d=1/(1+r), d/(1−d) = [1/(1+r)]/[1 − 1/(1+r)] = 1/r, so the present value of the infinite sum can also be written as $p/r. • That is, $pd/(1−d) = $p/r, since by definition, d=1/(1+r).
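The convergence claim above can be checked numerically; a minimal Python sketch (not part of the original slides, with an illustrative function name), using the d=.80 example:

```python
# Verify numerically that d + d^2 + d^3 + ... converges to d/(1-d),
# and that d/(1-d) = 1/r when d = 1/(1+r).
def pv_of_one_per_round(d, n_terms=5000):
    """Partial sum of the discounted stream d + d^2 + ... + d^n."""
    return sum(d ** t for t in range(1, n_terms + 1))

d = 0.80                 # discount factor from the slides' example
r = 1 / d - 1            # implied interest rate, since d = 1/(1+r)

print(pv_of_one_per_round(d))   # approx. 4.0
print(d / (1 - d))              # approx. 4.0 (closed form)
print(1 / r)                    # approx. 4.0, since r = 0.25
```

All three quantities agree, matching the slide's claim that $pd/(1−d) = $p/r.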
  147. The Prisoner’s Dilemma Game (Again!) • Consider a new version of the prisoner’s dilemma game, where higher payoffs are now preferred to lower payoffs. C=cooperate (don’t confess), D=defect (confess):

           C     D
      C   c,c   a,b
      D   b,a   d,d

  • To make this a prisoner’s dilemma, we must have b > c > d > a. We will use this example in what follows. Suppose the payoff numbers are in dollars:

           C     D
      C   4,4   0,6
      D   6,0   2,2
  148. Sustaining Cooperation in the Infinitely Repeated Prisoner’s Dilemma Game • The outcome C,C forever, yielding payoffs (4,4) can be a subgame perfect equilibrium of the infinitely repeated prisoner’s dilemma game, provided that 1) the discount factor that both players use is sufficiently large and 2) each player uses some kind of contingent or trigger strategy. For example, the grim trigger strategy: – First round: Play C. – Second and later rounds: so long as the history of play has been (C,C) in every round, play C. Otherwise play D unconditionally and forever. • Proof: Consider a player who follows a different strategy, playing C for awhile and then playing D against a player who adheres to the grim trigger strategy.
  149. Cooperation in the Infinitely Repeated Prisoner’s Dilemma Game, Continued • Consider the infinitely repeated game starting from the round in which the “deviant” player first decides to defect. In this round the deviant earns $6, or $2 more than from C: $6−$4=$2. • Since the deviant player chose D, the other player’s grim trigger strategy requires the other player to play D forever after, and so both will play D forever, a loss of $4−$2=$2 in all future rounds. • The present discounted value of a loss of $2 in all future rounds is $2d/(1−d). • So the player thinking about deviating must consider whether the immediate gain of 2 > 2d/(1−d), the present value of all future lost payoffs, or if 2(1−d) > 2d, or 2 > 4d, or 1/2 > d. • If 1/2 < d < 1, the inequality does not hold, and so the player thinking about deviating is better off playing C forever.
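The deviation arithmetic above can be sketched in Python; the payoff numbers 6, 4, 2 come from the stage game, and the function name is illustrative:

```python
def defection_gain_vs_grim_trigger(d):
    """Net gain from defecting once against a grim-trigger opponent:
    +2 today (6 instead of 4), minus a loss of 2 in every future round,
    worth 2*d/(1-d) in present value."""
    return 2 - 2 * d / (1 - d)

# Defection pays only when d < 1/2:
print(defection_gain_vs_grim_trigger(0.4))   # positive: deviate
print(defection_gain_vs_grim_trigger(0.6))   # negative: cooperate forever
```

The sign flips exactly at d = 1/2, the threshold derived on the slide.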
  150. Other Subgame Perfect Equilibria are Possible in the Repeated Prisoner’s Dilemma Game • The “Folk theorem” says that almost any outcome that on average yields the mutual defection payoff or better to both players can be sustained as a subgame perfect Nash equilibrium of the indefinitely repeated Prisoner’s Dilemma game. • The set of subgame perfect Nash equilibria is determined by average payoffs from all rounds played (for a large enough discount factor, d). [Figure: axes are Row Player Avg. Payoff and Column Player Avg. Payoff; the green area is the set of subgame perfect Nash equilibria, with the mutual-cooperation-in-all-rounds equilibrium outcome marked.]
  151. Must We Use a Grim Trigger Strategy to Support Cooperation as a Subgame Perfect Equilibrium in the Infinitely Repeated PD? • There are “nicer” strategies that will also support (C,C) as an equilibrium. • Consider the “tit-for-tat” (TFT) strategy (row player version) – First round: Play C. – Second and later rounds: If the history from the last round is (C,C) or (D,C) play C. If the history from the last round is (C,D) or (D,D) play D. • This strategy “says” play C initially and as long as the other player played C last round. If the other player played D last round, then play D this round. If the other player returns to playing C, play C at the next opportunity, else play D. •TFT is forgiving, while grim trigger (GT) is not. Hence TFT is regarded as being “nicer.”
  152. TFT Supports (C,C) forever in the Infinitely Repeated PD • Proof. Suppose both players play TFT. Since the strategy specifies that both players start off playing C, and continue to play C so long as the history includes no defections, the history of play will be (C,C), (C,C), (C,C), … • Now suppose the Row player considers deviating in one round only and then reverting to playing C in all further rounds, while the Column player is assumed to play TFT. • Row’s payoffs starting from the round in which he deviates are: 6, 0, 4, 4, 4, … If he never deviated, he would have gotten the sequence of payoffs 4, 4, 4, 4, 4, … So the relevant comparison is whether 6 + d·0 > 4 + 4d. The inequality holds if 2 > 4d, or 1/2 > d. So if 1/2 < d < 1, the TFT strategy deters deviations by the other player.
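A quick numerical check of the one-shot-deviation comparison above (an illustrative sketch, not from the slides; only the first two rounds differ between the two plans):

```python
def one_shot_deviation_payoffs(d):
    """Row deviates to D once against TFT, then returns to C, vs. never
    deviating. The payoff streams differ only in the first two rounds:
    6 + d*0 versus 4 + d*4 (all later rounds give 4 under both plans)."""
    deviate = 6 + d * 0
    stay = 4 + 4 * d
    return deviate, stay

for d in (0.25, 0.75):
    dev, stay = one_shot_deviation_payoffs(d)
    print(d, dev > stay)   # deviation pays only when d < 1/2
```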
  153. TFT as an Equilibrium Strategy is not Subgame Perfect • To be subgame perfect, an equilibrium strategy must prescribe best responses after every possible history, even those with zero probability under the given strategy. • Consider two TFT players, and suppose that the row player “accidentally” deviates to playing D for one round – “a zero probability event” – but then continues playing TFT as before. • Starting with the round of the deviation, the history of play will look like this: (D,C), (C,D), (D,C), (C,D), … Why? Just apply the TFT strategy. • Consider the payoffs to the column player starting from round 2: 6 + 0d + 6d² + 0d³ + 6d⁴ + 0d⁵ + … = 6 + 6(d² + d⁴ + …) = 6 + 6d²/(1−d²) = 6/(1−d²).
  154. TFT is not Subgame Perfect, cont’d. • If the column player instead deviated from TFT, and played C in round 2, the history would become: (D,C), (C,C), (C,C), (C,C), … • In this case, the payoffs to the column player starting from round 2 would be: 4 + 4d + 4d² + 4d³ + … = 4 + 4(d + d² + d³ + …) = 4 + 4d/(1−d) = 4/(1−d). • The column player asks whether 6/(1−d²) > 4/(1−d) ⇔ 4d² − 6d + 2 > 0, which is false for any 1/2 < d < 1. • The column player reasons that it is better to deviate from TFT!
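The two continuation values can be compared numerically; a small sketch with illustrative names, using the payoffs derived on the last two slides:

```python
def continuation_values(d):
    """Column player's continuation payoffs after Row's accidental D,
    evaluated from round 2 onward."""
    stick_with_tft = 6 / (1 - d ** 2)   # alternating (C,D)/(D,C) cycle
    play_c_once = 4 / (1 - d)           # restores mutual cooperation forever
    return stick_with_tft, play_c_once

tft, dev = continuation_values(0.75)
print(tft, dev)   # deviating from TFT does better for any 1/2 < d < 1
```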
  155. Must We Discount Payoffs? • Answer 1: How else can we distinguish between infinite sums of different constant payoff amounts? (All = ∞.) • Answer 2: We don’t have to assume that players discount future payoffs. Instead, we can assume that there is some constant, known probability q, 0 < q < 1, that the game will continue from one round to the next. Assuming this probability is independent from one round to the next, the probability the game is still being played T rounds from right now is qᵀ. – Hence, a payoff of $p in every future round of an infinitely repeated game with a constant probability q of continuing from one round to the next has a value right now that is equal to: $p(q + q² + q³ + …) = $p[q/(1−q)]. – Similar to discounting of future payoffs; equivalent if q=d.
  156. Play of a Prisoner’s Dilemma with an Indefinite End • Let’s play the Prisoner’s Dilemma game studied today, but with a probability q=.8 that the game continues from one round to the next. • What this means is that at the end of each round the computer draws a random number between 0 and 1. If this number is less than or equal to .80, the game continues with another round. Otherwise the game ends. • This is an indefinitely repeated game; you don’t know when it will end. • The expected number of rounds is 1/(1−q) = 1/.2 = 5. In practice, you may play more or fewer than 5 rounds; it just depends on the sequence of random draws.
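The claim that the expected number of rounds is 1/(1−q) = 5 can be checked by simulating the continuation rule; a hypothetical sketch (function name and trial count are illustrative):

```python
import random

def average_rounds(q, trials=100_000, seed=0):
    """Simulate the continuation rule: after each round the game continues
    with probability q; return the average number of rounds played."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        rounds = 1
        while rng.random() < q:   # draw in [0,1); continue with prob. q
            rounds += 1
        total += rounds
    return total / trials

print(average_rounds(0.8))   # close to 1/(1-q) = 5
```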
  157. Complete vs. Incomplete Information Games All games can be classified as complete information games or incomplete information games. Complete information games – the player whose turn it is to move knows at least as much as those who moved before him/her. Complete information games include: Perfect information games – players know the full history of the game, all moves made by all players, etc., and all payoffs, e.g. an extensive form game without any information sets. Imperfect information games – games involving simultaneous moves where players know all the possible outcomes/payoffs, but not the actions chosen by other players. Incomplete information games: At some node in the game, the player whose turn it is to make a choice knows less than a player who has already moved.
  158. Imperfect vs. Incomplete Information Games • In a game of imperfect information, players are simply unaware of the actions chosen by other players. However, they know who the other players are, what their possible strategies/actions are, and the preferences/payoffs of these other players. Hence, information about the other players in imperfect information games is complete. • In incomplete information games, players may or may not know some information about the other players, e.g. their “type”, their strategies, payoffs or preferences.
  159. Example 1a of an Incomplete Information Game • Prisoner’s Dilemma Game. Player 1 has the standard selfish preferences but Player 2 has either selfish preferences or nice preferences. • Player 2 knows her type, but Player 1 does not know 2’s type.

  Player 2 selfish:            Player 2 nice:
              2                            2
            C     D                      C     D
    1  C   4,4   0,6             1  C   4,6   0,4
       D   6,0   2,2                D   6,2   2,0

  • Recall that C=cooperate, D=defect. If player 2 is selfish then player 1 will want to choose D, but if player 2 is nice, player 1’s best response is still to choose D, since D is a dominant strategy for player 1 in this incomplete information game.
  160. Example 1b of an Incomplete Information Game • Prisoner’s Dilemma Game. Player 1 has the standard selfish preferences but Player 2 has either selfish preferences or nice preferences. Suppose player 1’s preferences now depend on whether player 2 is nice or selfish (or vice versa).

  Player 2 selfish:            Player 2 nice:
              2                            2
            C     D                      C     D
    1  C   4,4   0,6             1  C   6,6   2,4
       D   6,0   2,2                D   4,2   0,0

  • If 2 is selfish then player 1 will want to be selfish and choose D, but if player 2 is nice, player 1’s best response is to play C! • Be nicer to those who play nice, mean to those who play mean.
  161. Example 1b in Extensive Form, Where Player 2’s Type is Due to “Nature” Information Set Prevents 1 From Knowing 2’s Type and 2’s Move
  162. Example 1b Again, But With a Higher Probability that Type 2 is Selfish
  163. Analysis of Example 1b • Player 2 knows his type, and plays his dominant strategy: D if selfish, C if nice. • Player 1’s choice depends on her expectation concerning the unknown type of player 2. – If player 2 is selfish, player 1’s best response is to play D. – If player 2 is nice, player 1’s best response is to play C. • Suppose player 1 attaches probability p to Player 2 being selfish, so 1−p is the probability that Player 2 is nice. • Player 1’s expected payoff from C is 0p + 6(1−p). • Player 1’s expected payoff from D is 2p + 4(1−p). • Indifference: 0p + 6(1−p) = 2p + 4(1−p), so 6 − 6p = 4 − 2p, 2 = 4p, p = 1/2. • Player 1’s best response is to play C if p < 1/2, D otherwise. • In the first version, p=1/3, so play C; in the second, p=2/3, so play D.
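The indifference calculation can be verified with a few lines of Python (an illustrative sketch; the payoffs are those of Example 1b):

```python
def player1_best_response(p):
    """p = probability player 2 is selfish; payoffs from Example 1b."""
    expected_c = 0 * p + 6 * (1 - p)   # expected payoff to player 1 from C
    expected_d = 2 * p + 4 * (1 - p)   # expected payoff to player 1 from D
    return 'C' if expected_c > expected_d else 'D'

print(player1_best_response(1/3))   # 'C'  (first version of the example)
print(player1_best_response(2/3))   # 'D'  (second version)
```

The crossover sits at p = 1/2, as derived on the slide.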
  164. The Nature of “Nature” • What does it mean to add nature as a player? It is simply a proxy for saying there is some randomness in the type of player with whom you play a game. • The probabilities associated with nature’s move are the subjective probabilities of the player facing the uncertainty about the other player’s type. • When thinking about player types, two stories can be told. – The identity of a player is known, but his preferences are unknown. “I know I am playing against Tom, but I do not know whether he is selfish or nice.” Nature whispers to Tom his type, and I, the other player, have to figure it out. – Nature selects from a population of potential player types. I am going to play against another player, but I do not know if she is smart or dumb, forgiving or unforgiving, rich or poor, etc. Nature decides.
  165. Example 2: Michelle and the Two Faces of Jerry

  Jerry likes company:              Jerry is a loner:
                   J                                 J
             Dancing  Frat Party              Dancing  Frat Party
  M Dancing    2,1       0,0       M Dancing    2,0       0,2
    Frat Party 0,0       1,2         Frat Party 0,1       1,0

  • Assume that Jerry knows his true type, and therefore, which of the two games is being played. • Assume Michelle attaches probability p to Jerry liking company and 1−p to Jerry being a loner. • Big assumption: Assume Jerry knows Michelle’s estimate of p (assumption of a common prior).
  166. The Game in Extensive Form
  167. Bayes-Nash Equilibria Bayes-Nash equilibrium is a generalization of Nash equilibrium for incomplete information games. 1. First, convert the game into a game of imperfect information. 2. Second, use the Nash equilibria of this imperfect information game as the solution concept. Apply this technique to the Michelle and Jerry game. Michelle’s pure strategy choices are Dancing, D, or Party, P. She can also play a mixed strategy, playing D with probability λ. Jerry’s strategy is a pair, one for each “type” of Jerry: the first component is for the Jerry who likes company (Jerry type 1) and the second component is for Jerry the loner (Jerry type 2). Pure strategies for Jerry are thus (D,D), (D,P), (P,D), and (P,P). Jerry also has a pair of mixed strategies, μ1 and μ2, indicating the probability Jerry plays D if he is type 1 or type 2, respectively. Focus on pure strategies.
  168. Pure Strategy Bayes-Nash Equilibria Suppose Michelle plays D for certain, λ=1. Type 1 Jerry plays D. Type 2 Jerry plays P: Jerry (D,P). Does Michelle maximize her payoffs by playing D against the Jerrys’ pure strategy of (D,P)? • With probability p, she gets the D,D payoff 2, and with probability 1−p she gets the D,P payoff, 0. So her expected payoff from D against Jerry (D,P) is 2p. • If she played P against Jerry (D,P), she would get with probability p the P,D payoff, 0, and with probability 1−p the P,P payoff, 1. So her expected payoff from P against Jerry (D,P) is 1−p. • Finally, playing D against Jerry (D,P) is a best response if 2p > 1−p, or if 3p > 1, or if p > 1/3. • If p > 1/3, it is a Bayes-Nash equilibrium for Michelle to play D, while the Jerrys play (D,P).
  169. Pure Strategy Bayes-Nash Equilibria, Contd. Next suppose Michelle plays P for certain, λ=0. Type 1 Jerry plays P. Type 2 Jerry plays D: Jerry (P,D). Does Michelle maximize her payoffs by playing P against the Jerrys’ pure strategy of (P,D)? • With probability p, she gets the P,P payoff 1, and with probability 1−p she gets the P,D payoff, 0. So her expected payoff from P against Jerry (P,D) is p. • If she played D against Jerry (P,D), she would get with probability p the D,P payoff, 0, and with probability 1−p the D,D payoff, 2. So her expected payoff from D against Jerry (P,D) is 2(1−p). • Finally, playing P against Jerry (P,D) is a best response if p > 2(1−p), or if 3p > 2, or if p > 2/3. • If p > 2/3, it is a Bayes-Nash equilibrium for Michelle to play P, while the Jerrys play (P,D).
  170. Summary • If p > 2/3, there are 2 pure-strategy Bayes-Nash equilibria 1. Michelle plays D, the Jerrys play (D,P). 2. Michelle plays P, the Jerrys play (P,D). • If 2/3 > p > 1/3 there is just 1 pure strategy Bayes Nash equilibrium, #1 above, where Michelle plays D and the Jerrys play (D,P). In our example, p=1/2, so Michelle should play D and the two Jerrys play (D,P). • If p < 1/3 there is no pure strategy Bayes-Nash equilibrium.
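The thresholds p > 1/3 and p > 2/3 from the last three slides can be bundled into one illustrative function (the name and return format are hypothetical):

```python
def pure_bayes_nash(p):
    """Which pure-strategy Bayes-Nash equilibria exist for a given prior p
    that Jerry likes company (payoff comparisons from the slides)."""
    eqs = []
    # Michelle's D is a best response to Jerry (D,P) iff 2p > 1-p, i.e. p > 1/3
    if 2 * p > 1 - p:
        eqs.append(('D', ('D', 'P')))
    # Michelle's P is a best response to Jerry (P,D) iff p > 2(1-p), i.e. p > 2/3
    if p > 2 * (1 - p):
        eqs.append(('P', ('P', 'D')))
    return eqs

print(pure_bayes_nash(0.5))   # only the (D, (D,P)) equilibrium
print(pure_bayes_nash(0.8))   # both equilibria
print(pure_bayes_nash(0.2))   # none
```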
  171. Example 3: Market Entry With Unknown Number of Players • Another kind of incomplete information game is where the set of players you are playing against are unknown as in this example. Incumbent does not know if he faces 1 entrant or a joint venture involving 2 entrants, as indicated by the dashed information sets
  172. Analysis of Example 3 • Three players, first entrant, second entrant, incumbent. • Consider the subgames that arise conditional on the second entrant’s strategy: accept or decline the invitation by first entrant for a joint venture. • Since the first entrant always moves first and the incumbent does not observe his moves, we can treat these subgames as 2 player, simultaneous-move games between the first entrant and the incumbent only. • There are two such games: If the second entrant declines, and if the second entrant accepts.
  173. If the Second Entrant Declines • Middle payoff of 0 is the second entrant’s payoff. • Evidently, the unique Nash equilibrium of this subgame is fight, stay out, yielding a payoff to the incumbent, second entrant, first entrant of (3,0,0).
  174. If the Second Entrant Accepts • The unique Nash equilibrium of this subgame is accommodate, propose, yielding a payoff to the incumbent, second entrant, first entrant of (0,4,4).
  175. What Does the Second Entrant Do? • The second entrant compares the payoffs in the two Nash equilibria: (3,0,0) versus (0,4,4). • She would choose the equilibrium that yields (0,4,4), since her payoff in that equilibrium, 4, is greater than the 0 payoff she gets in the other equilibrium. • Therefore, the second entrant accepts the first entrant’s proposal for a joint venture. • The unique Nash equilibrium of this incomplete information game is the strategy (accommodate, propose, accept) as played by the incumbent, first entrant, and second entrant.
  176. ? ? ? Signaling Games ? ? ? • In incomplete information games, one player knows more information than the other player. • So far, we have focused on the case where the “type” of the more informed player was known to that player but unknown to the less informed player. • Signaling games are incomplete information games where the more informed player has to decide whether to signal in some way their true type, and the less informed player has to decide how to respond to both the uncertainty about his opponent’s type and the signal his opponent has sent, recognizing that signals may be strategically chosen.
  177. What are Signals? • Signals are actions that more informed players use to convey information to less informed players about the unobservable type of the more informed player. – Example: A player who wants the trust of a less informed player may signal past instances of trust, may provide verbal assurances of trustworthiness, the names of character references/former employers on a resume, discuss church attendance, charity work, etc. • Signals may or may not be credible. Why? Because individuals will use signals strategically when it suits them. Less qualified applicants may “pad” their resumes, lie about their past work history/qualifications, embezzle from their church/charity. – Talk is cheap: “Yeah, right”; “whatever”; “I could care less” are common. – The more credible signals involve costly actions, e.g. a college diploma, an artistic portfolio, a published book, a successful business.
  178. Examples of Strategic Signaling • Insurance contracts: Accident prone types will want greater coverage, lower deductibles, while those unlikely to be in accidents will require minimal coverage, higher deductibles. Insurance companies respond to these signals by charging higher prices for greater coverage/lower deductible. • Used cars: The dealer has to decide whether to offer a warranty on a used car or offer the car “as is.” • Pittsburgh left-turn game: The left-turner can attempt to signal whether he is a Pittsburgher or an Out-of-Towner. • Letter grade or pass/fail grades: Letter grade signals more commitment, risk-taking; pass grade signals lowest possible passing letter grade, C-.
  179. Example 1: Prisoner’s Dilemma Again. • Recall the Prisoner’s Dilemma game from last week, where player 1’s preferences depend on whether player 2 is nice or selfish.

  Player 2 selfish:            Player 2 nice:
              2                            2
            C     D                      C     D
    1  C   4,4   0,6             1  C   6,6   2,4
       D   6,0   2,2                D   4,2   0,0

  • Suppose player 2 can costlessly signal to player 1 her action choice before player 1 gets to choose. The signal is nonbinding, “cheap talk.” Player 1 observes this signal before making his own move, but still does not know what type of player 2 he is facing, selfish or nice. • For example, if player 2 signals C, player 1 wants to play C if player 2 is nice, but D if player 2 is selfish.
  180. Example 1 in Extensive Form Note the two information sets for player 1 (P1): Given a signal, C or D, P1 does not know if the player 2 (P2) is selfish or nice
  181. Analysis of Example 1 • The signal is player 2’s cheap talk message of C or D. – “I intend to play C” or “I intend to play D” • Both player 2 types have an incentive to signal C. A selfish player 2 wants player 1 to play C so she can play D and get the highest possible payoff for herself. A nice player 2 wants player 1 to play C so she can play C and get the highest possible payoff for herself. • If the two types sent different signals, player 1 would be able to differentiate between the two types of player 2, and the game would then be like a game of complete information. • Therefore, both player 2 types signal C: the signal is perfectly uninformative; this is called a pooling equilibrium outcome. • Player 1 will play C if the prior probability that player 2 is selfish is p < 1/2. – In this example, since p = 1/3 < 1/2, player 1 should play C.
  182. Example 2: Market Entry Game with Signaling • Two firms, incumbent is Oldstar, the new firm is Nova. • Oldstar is a known, set-in-its-ways company, but Nova is a relatively unknown startup. Oldstar reasons that Nova is one of two types: “strong” or “weak.” Nova knows its type. • In a fight, Oldstar can beat a weak Nova, but a strong Nova can beat Oldstar. The winner has the market all to itself. • If Oldstar has the market to itself, it makes a profit of 3, and if Nova has the market to itself it makes a profit of 4. • The cost of a fight is –2 to both firms. These facts are reflected in the payoff matrix given to the right
  183. The Equilibrium Without Signaling • Let w be the probability that Nova is weak, and so 1-w is the probability that Nova is strong. • In the absence of any signals from Nova, Oldstar will calculate the expected payoff from fighting, which is (w)1+(1-w)(-2)=3w-2, and compare it with the payoff from retreating which is 0. • If 3w-2 > 0, Oldstar’s best response is to fight, or in other words, Oldstar fights if: 3w > 2, or w > 2/3. ⇒ Oldstar fights only if its prior belief is that Nova is very likely to be weak, (chance is 2 out of 3).
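Oldstar's fight-or-retreat calculation is easy to check numerically (an illustrative sketch; payoffs 1 and −2 are from the slide):

```python
def oldstar_fights(w):
    """Oldstar fights iff its expected payoff w*1 + (1-w)*(-2) = 3w-2
    beats the retreat payoff of 0."""
    return w * 1 + (1 - w) * (-2) > 0

print(oldstar_fights(0.75))   # True: 3w-2 = 0.25 > 0
print(oldstar_fights(0.50))   # False: 3w-2 = -0.5 < 0
```

The threshold is w = 2/3, matching the conclusion that Oldstar fights only when Nova is very likely to be weak.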
  184. Signaling in Example 2 • Suppose Nova can provide a signal of its type by presenting some evidence that it is strong, in particular, by displaying prototypes of its new advanced products before it has the ability to produce and distribute a large quantity of these products. • If it is unable to produce/distribute enough to meet market demand if it is “weak” , Oldstar may be able to copy the new products and quickly flood the market. But if Nova is “strong” and is ready to produce/distribute enough to meet market demand, it will squeeze Oldstar out of the market. • Nova’s signal choice is therefore to display the new products, or not display the new products. • Suppose it is costly for a weak Nova to imitate a strong Nova. The cost for a weak Nova to display, c, is common knowledge (along with w). The cost for a strong Nova to display is 0.
  185. The Game in Extensive Form • Suppose w, the probability that Nova is weak, is 1/2. [Figure: extensive form game tree with Nature choosing Nova’s type, weak with probability w and strong with probability 1−w; a weak Nova’s payoffs are reduced by the display cost c.] • The cost c of displaying only applies to a weak Nova who chooses to display.
  186. Separating Equilibrium • Suppose c > 2, for example, c=3.
  187. Strong Novas Challenge and Display, Weak Don’t Challenge: There is Perfect Separation • If Nova Challenges and Displays, Oldstar knows Nova is strong, because it knows c>2 and can infer that only strong Novas would ever Challenge and Display, and so Oldstar always retreats in this case. • If Nova is weak and c>2, Nova’s dominant strategy is not to challenge, because any challenge results in a negative payoff, even if Oldstar retreats. Nova can get a 0 payoff from not challenging, so it does.
  188. Pooling Equilibrium • Suppose w <2/3 and c < 2, for example, w=1/2 and c=1. Suppose Oldstar’s strategy is to retreat if Nova Challenges and Displays. If c < 2, even weak Novas get a positive payoff from challenging with a Display
  189. Pooling Equilibrium Analysis • If w < 2/3, Oldstar retreats if it sees a Display. • If c < 2, both types of Novas find it profitable to Challenge and Display because Oldstar will retreat: a pooling equilibrium. • If Oldstar fights, it gets w(1) + (1−w)(−2) = 3w−2. • If w < 2/3, Oldstar’s expected payoff from fighting is negative.
  190. Semi-separating Equilibria • Suppose c < 2 and w > 2/3, for example, c=1 and w=3/4. • Neither a separating nor a pooling equilibrium is possible. ⇒ Weak Novas challenge with some probability p.
  191. How Does Oldstar React? • Oldstar draws inferences conditional on whether or not Nova displays. It does this according to Bayes Rule. • Oldstar responds to a display by fighting with probability q.

                            Display
  Nova’s True Type      Yes        No         Sum of Row
  Strong                1−w        0          1−w
  Weak                  wp         w(1−p)     w
  Sum of Column         1−w+wp     w(1−p)

  • If Oldstar sees a display, with probability wp/(1−w+wp) Nova is weak, and with probability (1−w)/(1−w+wp) Nova is strong.
  192. Semi-Separation Involves a Mixed Strategy • Oldstar’s expected payoff from fighting conditional on observing a display is: 1[wp/(1−w+wp)] + (−2)[(1−w)/(1−w+wp)] = [wp − 2(1−w)]/(1−w+wp). • Oldstar’s (expected) payoff from retreating is always 0. • So, Nova chooses p to keep Oldstar perfectly indifferent between fighting and retreating: [wp − 2(1−w)]/(1−w+wp) = 0, or wp − 2(1−w) = 0, so p = 2(1−w)/w.
  193. Mixed Strategy, Continued • Given Oldstar’s strategy of fighting when it sees a display with probability q, a weak Nova’s expected payoff from challenging with a display is: q(−2−c) + (1−q)(2−c) = 2 − c − 4q. • A weak Nova’s (expected) payoff from not challenging is always 0. • So, Oldstar chooses q to keep a weak Nova perfectly indifferent between challenging with a display and not challenging: 2 − c − 4q = 0, so q = (2−c)/4. • Summary: In the mixed strategy, semi-separating equilibrium, a weak Nova challenges with a display with probability p = 2(1−w)/w, and Oldstar fights after a display with probability q = (2−c)/4.
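The equilibrium mixing probabilities and both indifference conditions can be verified numerically; a sketch (with illustrative names) using the example values w=3/4, c=1:

```python
def semi_separating_mix(w, c):
    """Equilibrium mixing probabilities plus the two indifference checks."""
    p = 2 * (1 - w) / w          # prob. a weak Nova challenges with a display
    q = (2 - c) / 4              # prob. Oldstar fights after seeing a display
    # Oldstar's expected payoff from fighting after a display (should be 0):
    oldstar = (w * p * 1 + (1 - w) * (-2)) / (1 - w + w * p)
    # Weak Nova's expected payoff from challenging with a display (should be 0):
    weak_nova = q * (-2 - c) + (1 - q) * (2 - c)
    return p, q, oldstar, weak_nova

print(semi_separating_mix(w=0.75, c=1))   # p=2/3, q=1/4, both checks near 0
```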
  194. Summary: Equilibrium Types Depend on c and w

  Display cost        Probability Nova is Weak, w
                      w < 2/3           w > 2/3
  c > 2               Separating        Separating
  c < 2               Pooling           Semi-separating
  195. Strategic Behavior in Elections and Markets • Can game theory have anything to say about behavior by participants in elections or markets? • We often imagine that, in such environments, individual actors cannot have any impact on outcomes, i.e. that they are “atomistic.” Consequently, it seems that strategic behavior doesn’t matter. • For example: – Polls show that my candidate is losing. If I vote, he will lose anyway, so there is no reason to vote. – The price of tickets to a Pirates game is too high, but by refusing to buy tickets, I, by myself, am unlikely to affect the price. • As we shall see, strategy can matter in elections and markets if the number of players is small enough.
  196. Voting Games • Voting games involve groups of N people who decide some issue or choose some candidate by holding an election and counting votes. • There are two sides to voting games: – Possible strategic behavior among the voters themselves, if N is small. – If the voters are choosing candidates as in political elections, and the number of candidates is small, then the candidates have good reason to behave strategically. • The rules matter. The rules prescribe how the winning issue or winning candidate is determined. – Majority rule: issue/candidate with more than half of votes wins. – Plurality rule: issue/candidate with the highest frequency of votes wins: “first past the post.” • With 3 or more issues/alternatives/candidates, strategic voting behavior becomes an issue.
  197. An Issue Voting Game • Pitt’s Board of Trustees is comprised of 13 members who have different positions on how much Pitt should increase tuition next year. Suppose there are three position types: • Type X, 4 out of 13, thinks Pitt should seek a high tuition increase in light of the state government’s decreasing appropriation in recent years. If a high increase cannot be achieved, X types prefer a medium increase to no increase at all. • Type Y, 5 out of 13, thinks Pitt should seek a medium tuition increase, since too high an increase might reduce student applicants and the state’s appropriation. If a medium increase is not possible, Y types prefer no increase at all to a high increase. • Type Z, 4 out of 13, thinks Pitt should seek no increase in tuition, and ask the state government to increase its appropriation as a reward. However, if it appears that the other members will vote for some tuition increase (high or medium), type Zs reason that the state appropriation will not be increased, and so in that case Zs prefer a high increase to a medium increase.
  198. The Voting Outcome Depends on the Rules • Let H=high, M=medium, N=no increase. Preferences are:
  4/13 X: H ≻ M ≻ N
  5/13 Y: M ≻ N ≻ H
  4/13 Z: N ≻ H ≻ M
  • If plurality rules, and members vote their first choice, then a medium increase in tuition wins, as 5/13 > 4/13, but it does not comprise a majority of the trustees’ positions on tuition increases. • If majority rules are in place, so that tuition policy is not implemented unless more than half vote for it, and voters are rational, they may behave strategically: – Suppose voting takes place in a single stage. If types X and Y vote for their first preference, and type Z strategically votes for its second preference, H, then H wins a strict majority of votes, 8/13, and type Z trustees are strictly better off than they would have been if M was the majority winner (imagine assigning payoffs based on preferences). – Of course, the other types, X and Y, could also vote strategically, so that all 3 options, H, M, N, could earn majority votes!
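The plurality and strategic-majority outcomes can be tallied directly; a small sketch of the 13-trustee example (vote counts from the slide, variable names illustrative):

```python
from collections import Counter

N = 13
# Sincere first-choice voting by the three trustee types:
sincere = Counter({'H': 4, 'M': 5, 'N': 4})
plurality_winner = sincere.most_common(1)[0][0]

# Single-stage majority vote in which type Z strategically switches to
# its second preference H, while X and Y vote sincerely:
strategic = Counter({'H': 4 + 4, 'M': 5, 'N': 0})
majority_winner = next((x for x, v in strategic.items() if v > N / 2), None)

print(plurality_winner)   # 'M' with 5/13: a plurality, but not a majority
print(majority_winner)    # 'H' with 8/13: a strict majority
```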
  199. The Condorcet Rule • This rule, proposed by the Marquis de Condorcet, a French Enlightenment philosopher, circa 1775, involves a complete round-robin pairing of all issues/candidates. • The issue/candidate that wins majority votes in all pairwise matches is the Condorcet winner, a seemingly good approach to eliminating the problems with plurality voting. • Alas, this rule can lead to a paradox: one can devise situations where there is no Condorcet winner, leading to the so-called Condorcet paradox.