Comparative Ratios - Complexity results for some classes of strategic games

46 4 · Ranking Games

and computes two bits of output o = (o₁, o₂), given by o₁ = C(a₁, a₂, . . . , a_m) and o₂ = (o₁ ∨ (a_x XOR a_y)).

The possible outputs of the circuit are identified with permutations of the players in N such that the permutation π₀₀ corresponding to o = (0, 0) and the permutation π₁₁ corresponding to o = (1, 1) both rank x first and y last, the permutation π₀₁ corresponding to o = (0, 1) ranks y first and x last, and all other players are ranked in the same order in all three permutations. It should be noted that no matter how permutations are actually encoded as strings of binary values, the encoding of the above permutations can always be computed using a polynomial number of gates.

We now claim that, for arbitrary rank payoffs, Γ has a pure Nash equilibrium if and only if C has a satisfying assignment. This can be seen as follows:

If (a₁, a₂, . . . , a_m) is a satisfying assignment of C, only a player in {1, 2, . . . , m} could possibly change the outcome of the game by changing his action. However, these players are ranked in the same order in all the possible outcomes, so none of them can get a higher payoff by doing so. Thus, every action profile (a₁, a₂, . . . , a_m, a_x, a_y) such that (a₁, a₂, . . . , a_m) is a satisfying assignment of C is a Nash equilibrium.

If in turn (a₁, a₂, . . . , a_m) is not a satisfying assignment of C, both x and y are able to switch between outcomes π₀₀ and π₀₁ by changing their own action. Since, furthermore, every player strictly prefers being ranked first over being ranked last, x strictly prefers outcome π₀₀ over π₀₁, while y strictly prefers π₀₁ over π₀₀. Thus, (a₁, a₂, . . . , a_m, a_x, a_y) cannot be a Nash equilibrium, since either x or y could play a different action to get a higher payoff.
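The equivalence argued above can be checked by brute force on small instances. The following Python sketch is an illustration only: the function names, the encoding of the circuit C as a Python predicate, and the choice of one common rank payoff vector for all players are assumptions made for this example and are not part of the construction.

```python
from itertools import product

def outcome(C, profile):
    """Output bits (o1, o2) for an action profile (a_1, ..., a_m, a_x, a_y)."""
    *a, ax, ay = profile
    o1 = int(bool(C(*a)))
    o2 = o1 | (ax ^ ay)
    return (o1, o2)

def payoffs(C, profile, m, rank_payoff):
    """Payoffs of players 1..m, x, y (list indices 0..m+1).

    pi_00 and pi_11 rank x first and y last, pi_01 ranks y first and x last.
    Players 1..m keep ranks 2..m+1 in every outcome. rank_payoff[r] is the
    payoff for the (r+1)-th rank (assumed identical for all players).
    """
    n = m + 2
    if outcome(C, profile) == (0, 1):
        x_rank, y_rank = n - 1, 0
    else:
        x_rank, y_rank = 0, n - 1
    ranks = list(range(1, m + 1)) + [x_rank, y_rank]
    return [rank_payoff[r] for r in ranks]

def pure_nash_equilibria(C, m, rank_payoff):
    """All pure profiles from which no single player can profitably deviate."""
    n = m + 2
    equilibria = []
    for profile in product((0, 1), repeat=n):
        current = payoffs(C, profile, m, rank_payoff)
        stable = True
        for i in range(n):
            deviation = list(profile)
            deviation[i] ^= 1  # every player has exactly two actions
            if payoffs(C, tuple(deviation), m, rank_payoff)[i] > current[i]:
                stable = False
                break
        if stable:
            equilibria.append(profile)
    return equilibria

satisfiable = lambda a, b: a and not b      # satisfied only by a=1, b=0
unsatisfiable = lambda a, b: a and not a    # never satisfied
print(pure_nash_equilibria(satisfiable, 2, [1, 0, 0, 0]))
print(pure_nash_equilibria(unsatisfiable, 2, [1, 0, 0, 0]))
```

For the satisfiable circuit, the equilibria are exactly the profiles that extend the satisfying assignment by arbitrary actions of x and y; for the unsatisfiable one, the list is empty.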

4.6 · Comparative Ratios 47

4.6.1 The Price of Cautiousness

A compelling question is how much worse off a player can be if he were to revert to his most defensive course of action, his maximin strategy, instead of hoping for an equilibrium outcome. This difference in payoff can be represented by a numerical value which we refer to as the price of cautiousness. In the following, let G denote the class of all normal-form games, and for Γ ∈ G, let N(Γ) be the set of Nash equilibria of Γ. Further recall that v_i(Γ) denotes player i’s security level in game Γ.

Definition 4.9 (price of cautiousness). Let Γ be a normal-form game with non-negative payoffs, and let i ∈ N be a player such that v_i(Γ) > 0. The price of cautiousness for player i in Γ is defined as

PC_i(Γ) = min{p_i(s_N) : s_N ∈ N(Γ)} / v_i(Γ).

We further write PC_i(C) = sup_{Γ∈C} PC_i(Γ), where C ⊆ G can be any class of games involving player i. In other words, the price of cautiousness of a player is the ratio between his minimum payoff in a Nash equilibrium and his security level. It thus captures the worst-case loss the player may incur by playing his maximin strategy instead of a Nash equilibrium.² For a player whose security level equals his minimum payoff of zero, every strategy is a maximin strategy. Since we are mainly interested in a comparison of normative solution concepts, we will thus only consider games where the security level of at least one player is positive.

As we have already mentioned, Nash equilibrium and minimax strategies coincide in two-player ranking games by virtue of the Minimax Theorem of von Neumann (1928), so the price of cautiousness equals 1 for these games. In general ranking games, on the other hand, the price of cautiousness is unbounded.

Theorem 4.10. Let R be the class of ranking games with more than two players that involve player i. Then, the price of cautiousness is unbounded, i.e., PC_i(R) = ∞, even if R only contains games without weakly dominated actions.

Proof. Consider the game Γ₁ of Figure 4.8, which is a ranking game for rank payoff vectors ~p₁ = (1, ε, 0), ~p₂ = (1, 0, 0), and ~p₃ = (1, 1, 0), and rankings [2, 3, 1], [1, 3, 2], [1, 2, 3], [2, 1, 3], and [3, 1, 2]. It is easily verified that none of the actions of Γ₁ is weakly dominated and that v₁(Γ₁) = ε. Let further s_N = (s₁, s₂, c¹) be the strategy profile where s₁ and s₂ are uniform mixtures of a¹ and a², and of b¹ and b², respectively. It is easily verified that s_N is a Nash equilibrium of Γ₁, and we will argue that it is in fact the only one. For this, consider the possible strategies of player 3. If player 3 plays c¹, the game reduces to the well-known Matching Pennies game for players 1 and 2, the only Nash equilibrium being the one described above. If on the other hand player 3 plays c², action b¹ strictly dominates b². If b¹ is played, however, player 3 will deviate to c¹ to get a higher payoff. Finally, if player 3 randomizes between actions c¹ and c², the payoff obtained from both of these actions must be the same. This can only be the case if either player 1 plays a¹ and player 2 randomizes between b¹ and b², or if player 1 plays a² and player 2 plays b². In the former case, player 2 will deviate to b¹. In the latter case, player 1 will deviate to a¹. Since the payoff of player 1 in the above equilibrium is 1/2, we have PC₁(Γ₁) = 1/(2ε) → ∞ for ε → 0.

² In our context, the choice of whether to use the worst or the best equilibrium when defining the price of cautiousness is merely a matter of taste. All results in this section still hold when the best equilibrium is used instead of the worst one.

c¹         b¹           b²
a¹     (0, 1, 1)    (1, 0, 0)
a²     (1, 0, 1)    (0, 1, 1)

c²         b¹           b²
a¹     (ε, 1, 0)    (ε, 0, 1)
a²     (ε, 1, 0)    (ε, 0, 1)

Figure 4.8: Three-player ranking game Γ₁ used in the proof of Theorem 4.10

c¹     b¹    b²
a¹     2     1
a²     1     2

c²     b¹    b²
a¹     3     1
a²     1     1

Figure 4.9: Three-player single-winner game used in the proof of Theorem 4.11. Dotted boxes mark all Nash equilibria; one player may mix arbitrarily in boxes that span two outcomes.
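The claims about Γ₁ in the proof of Theorem 4.10 can be reproduced numerically for a fixed ε. The following Python sketch uses exact fractions; the function names and the value ε = 1/10 are assumptions of this example. It only confirms that the stated profile is an equilibrium and computes the resulting ratio; the uniqueness of the equilibrium rests on the argument in the proof and is not verified here.

```python
from fractions import Fraction
from itertools import combinations, product

def gamma1(eps):
    """Payoffs of Γ₁ (Figure 4.8), keyed by 0-based action indices (a, b, c)."""
    return {
        (0, 0, 0): (0, 1, 1), (0, 1, 0): (1, 0, 0),
        (1, 0, 0): (1, 0, 1), (1, 1, 0): (0, 1, 1),
        (0, 0, 1): (eps, 1, 0), (0, 1, 1): (eps, 0, 1),
        (1, 0, 1): (eps, 1, 0), (1, 1, 1): (eps, 0, 1),
    }

def expected_payoff(game, strategies, player):
    """Expected payoff of `player` when all players mix independently."""
    total = Fraction(0)
    for profile in product(*(range(len(s)) for s in strategies)):
        prob = Fraction(1)
        for strategy, action in zip(strategies, profile):
            prob *= strategy[action]
        total += prob * game[profile][player]
    return total

def is_nash(game, strategies):
    """True if no player gains by switching to any pure action."""
    for i, strategy in enumerate(strategies):
        current = expected_payoff(game, strategies, i)
        for a in range(len(strategy)):
            pure = [Fraction(int(b == a)) for b in range(len(strategy))]
            deviation = strategies[:i] + [pure] + strategies[i + 1:]
            if expected_payoff(game, deviation, i) > current:
                return False
    return True

def security_level_player1(game):
    """Exact maximin value of player 1, who has two actions.

    With probability q on a¹, every pure opponent profile yields a payoff that
    is linear in q, so the lower envelope is maximized at q = 0, q = 1, or at
    an intersection of two of these lines.
    """
    lines = [(Fraction(game[0, b, c][0]), Fraction(game[1, b, c][0]))
             for b, c in product(range(2), repeat=2)]
    candidates = {Fraction(0), Fraction(1)}
    for (u0, u1), (w0, w1) in combinations(lines, 2):
        slope_diff = (u0 - u1) - (w0 - w1)
        if slope_diff != 0:
            q = (w1 - u1) / slope_diff
            if 0 <= q <= 1:
                candidates.add(q)
    return max(min(q * u0 + (1 - q) * u1 for u0, u1 in lines)
               for q in candidates)

eps = Fraction(1, 10)  # illustrative value, any 0 < eps <= 1/2 behaves alike
game = gamma1(eps)
half = Fraction(1, 2)
s_N = [[half, half], [half, half], [Fraction(1), Fraction(0)]]
ratio = expected_payoff(game, s_N, 0) / security_level_player1(game)
print(is_nash(game, s_N), security_level_player1(game), ratio)
```

With ε = 1/10 the ratio evaluates to 5 = 1/(2ε), in line with the closed form in the proof.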

We proceed to show that, due to their structural limitations, the price of cautiousness in binary ranking games is bounded from above by the number of actions of the respective player. We also derive a matching lower bound.

Theorem 4.11. Let R_b be the class of binary ranking games with more than two players involving a player i with exactly k actions. Then, PC_i(R_b) = k, even if R_b only contains single-winner games or games without weakly dominated actions.

Proof. By definition, the price of cautiousness takes its maximum for maximum payoff in a Nash equilibrium, which is bounded by 1 in a ranking game, and minimum security level. We require the security level to be strictly positive, so for every opponent action profile s_−i ∈ S_−i there is some action a_i ∈ A_i such that p_i(a_i, s_−i) > 0, i.e., p_i(a_i, s_−i) = 1. It is then easily verified that player i can ensure a security level of 1/k by uniform randomization over his k actions. This results in a price of cautiousness of at most k.

For a matching lower bound, consider the single-winner game depicted in Figure 4.9. We will argue that all Nash equilibria of this game are mixtures of the action profiles (a², b¹, c²), (a², b², c²) and (a¹, b², c²). Each of these equilibria yields payoff 1 for player 1, twice as much as his security level of 1/2. To appreciate this, consider the possible strategies for player 3. If player 3 plays c¹, the game reduces to the well-known Matching Pennies game for players 1 and 2, in which they will randomize uniformly over both of their actions. In this case, player 3 will deviate to c². If player 3 plays c², we immediately obtain the equilibria described above. Finally, if player 3 randomizes between actions c¹ and c², the payoff obtained from both of these actions should be the same. This can only be the case if either player 1 plays a² and player 2 randomizes between b¹ and b², or if player 1 randomizes between a¹ and a² and player 2 plays b². In the former case, player 2 will play b², causing player 1 to deviate to a¹. In the latter case, player 1 will play a¹, causing player 2 to deviate to b¹.

c¹         b¹           b²
a¹     (0, 1, 1)    (1, 0, 0)
a²     (1, 0, 0)    (0, 1, 0)

c²         b¹           b²
a¹     (0, 1, 0)    (1, 0, 0)
a²     (1, 0, 1)    (1, 0, 1)

Figure 4.10: Three-player ranking game Γ₂ used in the proof of Theorem 4.11

The above construction can be generalized to k > 2 by virtue of a single-winner game with actions A₁ = {a¹, a², . . . , a^k}, A₂ = {b¹, b², . . . , b^k}, and A₃ = {c¹, c²}, and payoffs

p(aⁱ, b^j, c^ℓ) =
    (0, 1, 0)   if ℓ = 1 and i ≠ k − j + 1,
    (0, 0, 1)   if ℓ = 2 and i = j = 1,
    (1, 0, 0)   otherwise.

It is easily verified that the security level of player 1 in this game is 1/k, while, by similar arguments as above, his payoff in every Nash equilibrium equals 1. This shows tightness of the upper bound of k on the price of cautiousness for single-winner games.
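The generalized single-winner game is small enough to build and inspect for concrete k. The Python sketch below is a sanity check under stated assumptions: the function names are invented for this example, and only pure equilibria are enumerated, so the statement about mixed equilibria still relies on the argument in the text. The security level is bracketed from both sides: uniform mixing by player 1 guarantees at least 1/k, and the opponent profile "player 3 plays c¹, player 2 mixes uniformly" holds player 1 to at most 1/k.

```python
from fractions import Fraction
from itertools import product

def single_winner_game(k):
    """Winner (1, 2 or 3) of each profile (i, j, l), all indices 1-based."""
    winner = {}
    for i, j, l in product(range(1, k + 1), range(1, k + 1), (1, 2)):
        if l == 1 and i != k - j + 1:
            winner[i, j, l] = 2
        elif l == 2 and i == j == 1:
            winner[i, j, l] = 3
        else:
            winner[i, j, l] = 1
    return winner

def uniform_guarantee(winner, k):
    """Worst-case winning probability of player 1 under uniform mixing."""
    return min(
        Fraction(sum(winner[i, j, l] == 1 for i in range(1, k + 1)), k)
        for j, l in product(range(1, k + 1), (1, 2))
    )

def best_reply_against_uniform_b(winner, k):
    """Best winning probability of player 1 if player 3 plays c^1 and
    player 2 mixes uniformly; an upper bound on the security level."""
    return max(
        Fraction(sum(winner[i, j, 1] == 1 for j in range(1, k + 1)), k)
        for i in range(1, k + 1)
    )

def pure_nash_equilibria(winner, k):
    """Pure profiles in which no losing player can make himself the winner."""
    sizes = (k, k, 2)
    equilibria = []
    for profile in winner:
        stable = True
        for player in range(3):
            if winner[profile] == player + 1:
                continue  # the current winner cannot improve
            for alt in range(1, sizes[player] + 1):
                deviation = profile[:player] + (alt,) + profile[player + 1:]
                if winner[deviation] == player + 1:
                    stable = False
        if stable:
            equilibria.append(profile)
    return equilibria

for k in (2, 3, 4):
    game = single_winner_game(k)
    eqs = pure_nash_equilibria(game, k)
    print(k, uniform_guarantee(game, k), best_reply_against_uniform_b(game, k),
          all(game[e] == 1 for e in eqs))
```

For k = 2 the enumeration returns the three profiles named in the discussion of Figure 4.9.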

Now consider the game Γ₂ of Figure 4.10, which is a ranking game for rank payoff vectors ~p₁ = ~p₂ = (1, 0, 0) and ~p₃ = (1, 1, 0), and rankings [2, 3, 1], [1, 2, 3], [2, 1, 3], and [1, 3, 2]. It is easily verified that none of the actions of Γ₂ is weakly dominated and that v₁(Γ₂) = 1/2. On the other hand, we will argue that all Nash equilibria of Γ₂ are mixtures of action profiles (a², b¹, c²) and (a², b², c²), corresponding to a payoff of 1 for player 1.

To see this, we again look at the possible strategies for player 3. If player 3 plays c¹, players 1 and 2 will again randomize uniformly over both of their actions, causing player 3 to deviate to c². If player 3 plays c², we immediately obtain the equilibria described above. Finally, assume that player 3 randomizes between actions c¹ and c², and let α denote the probability with which player 1 plays a¹. Again, player 3 must be indifferent between c¹ and c², which can only hold for 1/2 ≤ α ≤ 1. In this case, however, player 2 will deviate to b¹.
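Two of the "easily verified" properties of Γ₂ can be confirmed mechanically. The Python sketch below is an illustration with invented function names; it checks domination by pure actions only (for players with two actions this is equivalent to domination by mixed strategies) and enumerates pure equilibria only.

```python
from itertools import product

# Payoffs of Γ₂ (Figure 4.10), keyed by 0-based action indices (a, b, c).
GAMMA2 = {
    (0, 0, 0): (0, 1, 1), (0, 1, 0): (1, 0, 0),
    (1, 0, 0): (1, 0, 0), (1, 1, 0): (0, 1, 0),
    (0, 0, 1): (0, 1, 0), (0, 1, 1): (1, 0, 0),
    (1, 0, 1): (1, 0, 1), (1, 1, 1): (1, 0, 1),
}
SIZES = (2, 2, 2)

def switch(profile, player, action):
    """Profile in which `player` plays `action` and everyone else is unchanged."""
    return profile[:player] + (action,) + profile[player + 1:]

def payoff_row(game, sizes, player, action):
    """Payoffs of `player` for `action` against each pure opponent profile."""
    return [game[switch(p, player, action)][player]
            for p in product(*map(range, sizes)) if p[player] == 0]

def weakly_dominated_actions(game, sizes):
    """Pairs (player, action) weakly dominated by another pure action."""
    dominated = []
    for player, size in enumerate(sizes):
        for a, b in product(range(size), repeat=2):
            row_a = payoff_row(game, sizes, player, a)
            row_b = payoff_row(game, sizes, player, b)
            if row_a != row_b and all(y >= x for x, y in zip(row_a, row_b)):
                dominated.append((player, a))
    return dominated

def pure_nash_equilibria(game, sizes):
    """Pure profiles from which no unilateral deviation is profitable."""
    return [p for p in game
            if all(game[switch(p, i, a)][i] <= game[p][i]
                   for i, size in enumerate(sizes) for a in range(size))]

print(weakly_dominated_actions(GAMMA2, SIZES))
print(sorted(pure_nash_equilibria(GAMMA2, SIZES)))
```

The first list is empty, and the second contains exactly the index pairs of (a², b¹, c²) and (a², b², c²), both with payoff 1 for player 1.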

This construction can be generalized to k > 2 by virtue of a game with actions A₁ = {a¹, a², . . . , a^k}, A₂ = {b¹, b², . . . , b^k}, and A₃ = {c¹, c²}, and payoffs

p(aⁱ, b^j, c^ℓ) =
    (0, 1, 1)   if i = j = ℓ = 1,
    (1, 0, 0)   if ℓ = 1 and i = k − j + 1, or ℓ = 2, i = 1 and j > 1,
    (1, 0, 1)   if ℓ = 2 and j > 2,
    (0, 1, 0)   otherwise.

Again, it is easily verified that player 1 has a security level of 1/k, while his payoff is 1 in every Nash equilibrium by similar arguments as above. Thus, the upper bound of k for the price of cautiousness is tight as well for binary ranking games without weakly dominated actions.

Informally, the previous theorem states that the payoff a player with k actions can obtain in Nash equilibrium can be at most k times his security level.

4.6.2 The Value of Correlation

Nash equilibrium is based on the assumption that players select their actions independently from each other. Aumann (1974) generalizes the notion of a strategy profile by allowing players to coordinate their actions by means of a device or agent that randomly selects one of several action profiles and recommends the actions of this profile to the respective players. More formally, a correlated strategy µ ∈ ∆(A_N) is a probability distribution over the set of action profiles. The corresponding equilibrium concept is then defined as follows.

Definition 4.12 (correlated equilibrium). A correlated strategy µ ∈ ∆(A_N) is called a correlated equilibrium if for all i ∈ N and all a^∗_i, a_i ∈ A_i,

∑_{a_−i ∈ A_−i} µ(a_−i, a^∗_i) (p_i(a_−i, a^∗_i) − p_i(a_−i, a_i)) ≥ 0.

In other words, a correlated equilibrium of a game is a probability distribution µ over the set of action profiles, such that, if a particular action profile a^∗_N ∈ A_N is chosen according to this distribution, and every player i ∈ N is only informed about his own action a^∗_i, it is optimal in expectation for every player i ∈ N to play a^∗_i, given that he only knows the conditional distribution over values of a^∗_−i. Correlated equilibrium makes stronger assumptions than Nash equilibrium in that it assumes the existence of a trusted third party who can recommend behavior, but cannot enforce it. Using cryptographic means, this requirement can essentially be reduced to the ability to carry out a distributed computation among the players (Dodis et al., 2000).

It can easily be seen that every Nash equilibrium naturally corresponds to a correlated equilibrium. Nash’s existence result thus carries over to correlated equilibria. Again consider the game of Figure 4.1 on Page 31. The correlated strategy that assigns probability 1/4 each to action profiles (a¹, b¹, c¹), (a¹, b², c¹), (a², b¹, c¹), and (a², b¹, c²) is a correlated equilibrium, with an expected payoff of 1/2 for player 1 and 1/4 for players 2 and 3. In this particular case, the correlated equilibrium is a convex combination of Nash equilibria, and correlation can be achieved by means of a publicly observable random variable. Quite surprisingly, Aumann (1974) has shown that in general the (expected) social welfare of a correlated equilibrium may exceed that of every Nash equilibrium, and that correlated equilibrium payoffs may in fact be outside the convex hull of the Nash equilibrium payoffs. This is of course not possible if social welfare is identical in all outcomes, as is the case in our example.

We will now turn to the question whether, and by which amount, social welfare in a ranking game can be improved by allowing players to correlate their actions. Just as the payoff of a player in any Nash equilibrium is at least his security level, social welfare in the best correlated equilibrium is at least as high as social welfare in the best Nash equilibrium. In order to quantify the value of correlation in strategic games with non-negative payoffs, Ashlagi et al. (2005) introduce the mediation value of a game as the ratio between the maximum social welfare in a correlated versus that in a Nash equilibrium, and the enforcement value as the ratio between the maximum social welfare in any outcome versus that in a correlated equilibrium. Whenever social welfare, i.e., the sum of all players’ payoffs, is used as a measure of global satisfaction, one implicitly assumes the inter-agent comparability of payoffs. While this assumption is controversial, social welfare is nevertheless commonly used in the definitions of comparative ratios such as the price of anarchy (Koutsoupias and Papadimitriou, 1999). For Γ ∈ G and X ⊆ ∆(A_N), let C(Γ) denote the set of correlated equilibria of Γ and let v_X(Γ) = max{p(s_N) : s_N ∈ X}. Recall that N(Γ) denotes the set of Nash equilibria of game Γ.

Definition 4.13 (mediation value, enforcement value). Let Γ be a normal-form game with non-negative payoffs. Then, the mediation value MV(Γ) and the enforcement value EV(Γ) of Γ are defined as

MV(Γ) = v_{C(Γ)}(Γ) / v_{N(Γ)}(Γ) and EV(Γ) = v_{S_N}(Γ) / v_{C(Γ)}(Γ).

If both numerator and denominator are 0 for one of the values, the respective value is defined to be 1. If only the denominator is 0, the value is defined to be ∞. For any class C ⊆ G of games, we further write MV(C) = sup_{Γ∈C} MV(Γ) and EV(C) = sup_{Γ∈C} EV(Γ).
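The two ratios and their conventions for zero denominators translate directly into code. The following Python sketch is a minimal illustration; the function and parameter names are assumptions of this example, and the three welfare values are taken as given inputs rather than computed from a game.

```python
import math
from fractions import Fraction

def welfare_ratio(numerator, denominator):
    """Ratio with the conventions of Definition 4.13:
    0/0 is defined as 1, and x/0 is infinite for x > 0."""
    if denominator == 0:
        return 1 if numerator == 0 else math.inf
    return Fraction(numerator) / Fraction(denominator)

def mediation_value(best_correlated_welfare, best_nash_welfare):
    """Best correlated-equilibrium welfare over best Nash-equilibrium welfare."""
    return welfare_ratio(best_correlated_welfare, best_nash_welfare)

def enforcement_value(best_outcome_welfare, best_correlated_welfare):
    """Best welfare of any outcome over best correlated-equilibrium welfare."""
    return welfare_ratio(best_outcome_welfare, best_correlated_welfare)

# Illustrative inputs, not derived from a particular game.
print(mediation_value(2, 1))
print(enforcement_value(2, Fraction(11, 10)))
print(welfare_ratio(0, 0), welfare_ratio(3, 0))
```

Passing exact fractions keeps the ratios exact, which is convenient when the welfare values depend on a small parameter.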

Ashlagi et al. (2005) have shown that both the mediation value and the enforcement value cannot be bounded for games with an arbitrary payoff structure, as soon as there are more than two players, or some player has more than two actions. This holds even if payoffs are normalized to the interval [0, 1]. Ranking games also satisfy this normalization criterion, and here social welfare is also strictly positive for every outcome of the game.

c¹         b¹           b²
a¹     (1, 1, 0)    (1, 0, 0)
a²     (0, 1, 0)    (0, 1, 1)

c²         b¹           b²
a¹     (0, 1, 1)    (0, 1, 0)
a²     (1, 0, 0)    (1, 1, 0)

c³         b¹           b²
a¹     (1, 0, 0)    (0, 0, 1)
a²     (0, 0, 1)    (1, 0, 0)

Figure 4.11: Three-player ranking game Γ₃ used in the proof of Theorem 4.14

Ranking games with identical rank payoff vectors for all players, i.e., ones where p^k_i = p^k_j for all i, j ∈ N and 1 ≤ k ≤ n, are constant-sum games. Hence, social welfare is the same in every outcome so that both the mediation value and the enforcement value are 1. This in particular concerns all ranking games with two players. In general, social welfare in an arbitrary outcome of a ranking game is bounded by n − 1 from above and by 1 from below. Since the Nash and correlated equilibrium payoffs must lie in the convex hull of the feasible payoffs of the game, we obtain trivial lower and upper bounds of 1 and n − 1, respectively, on both the mediation and the enforcement value. It turns out that the upper bound of n − 1 is tight for both the mediation value and the enforcement value.

Theorem 4.14. Let R′ be the class of ranking games with n > 2 players, such that in games with only three players at least one player has more than two actions. Then, MV(R′) = n − 1.

Proof. It suffices to show that for any of the above cases there is a ranking game with mediation value n − 1. For n = 3, consider the game Γ₃ of Figure 4.11, which is a ranking game for rank payoff vectors ~p₁ = ~p₃ = (1, 0, 0) and ~p₂ = (1, 1, 0). First of all, we will argue that every Nash equilibrium of this game has social welfare 1, by showing that there are no Nash equilibria where c¹ or c² are played with positive probability. Assume for contradiction that s^∗_N is such an equilibrium. The strategy played by player 3 in s^∗_N must either be (i) c¹ or c² as a pure strategy, (ii) a mixture of c¹ and c³ or of c² and c³, or (iii) a mixture where both c¹ and c² are played with positive probability. If player 3 plays a pure strategy, the game reduces to a two-player game for players 1 and 2. In the case of c¹, this game has the unique equilibrium (a¹, b¹), which in turn causes player 3 to deviate to c². In the case of c², the unique equilibrium is (a², b²), causing player 3 to deviate to c¹. Now assume that player 3 mixes between c¹ and c³, and let α and β denote the probabilities with which players 1 and 2 play a¹ and b¹, respectively. Since player 3’s payoff from c¹ and c³ must be the same in such an equilibrium, we must either have α = β = 1, in which case player 3 will deviate to c², or 0 ≤ α ≤ 1/2 and 0 ≤ β ≤ 1/2, causing player 2 to deviate to b¹. Analogously, if player 3 mixes between c² and c³, we must either have α = β = 0, in which case player 3 will deviate to c¹, or 1/2 ≤ α ≤ 1 and 1/2 ≤ β ≤ 1, causing player 2 to deviate to b². Finally, if both c¹ and c² are played with positive probability, we must have α + β = 1 for player 3 to get an identical payoff of αβ ≤ 1/4 from both c¹ and c². In this case, however, player 3 can deviate to c³ for a strictly greater payoff of 1 − 2αβ. Thus, a strategy profile s^∗_N as described above cannot exist.

d¹, c¹        b¹              b²
a¹      (1, 1, 0, 1)    (1, 0, 0, 0)
a²      (0, 1, 0, 0)    (0, 1, 1, 1)

d¹, c²        b¹              b²
a¹      (0, 1, 1, 1)    (0, 1, 0, 0)
a²      (1, 0, 0, 0)    (1, 1, 0, 1)

d², c¹        b¹              b²
a¹      (0, 0, 0, 1)    (0, 0, 0, 1)
a²      (0, 0, 0, 1)    (0, 0, 0, 1)

d², c²        b¹              b²
a¹      (0, 0, 0, 1)    (0, 0, 0, 1)
a²      (0, 0, 0, 1)    (0, 0, 0, 1)

Figure 4.12: Four-player ranking game Γ₄ used in the proof of Theorem 4.14

Now let µ^∗ be the correlated strategy where action profiles (a¹, b¹, c¹), (a², b², c¹), (a¹, b¹, c²), and (a², b², c²) are played with probability 1/4 each. This correlation can for example be achieved by tossing two coins independently. Players 1 and 2 observe the first coin toss and play a¹ and b¹ if the coin falls on heads, and a² and b² otherwise. Player 3 observes the second coin toss and plays c¹ if the coin falls on heads and c² otherwise. The expected payoff for player 2 under µ^∗ is 1, so he cannot gain by changing his action. If player 1 observes heads, he knows that player 2 will play b¹, and that player 3 will play c¹ and c² with probability 1/2 each. He is thus indifferent between a¹ and a². Player 3 knows that players 1 and 2 will play (a¹, b¹) and (a², b²) with probability 1/2 each, so he is indifferent between c¹ and c² and strictly prefers both of them to c³. Hence, none of the players has an incentive to deviate, and µ^∗ is a correlated equilibrium. Moreover, the social welfare under µ^∗ is 2, and thus MV(Γ₃) = 2.
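The inequalities of Definition 4.12 are finite in number, so the claim that µ^∗ is a correlated equilibrium of Γ₃ can be checked exhaustively. The Python sketch below does this with exact fractions; the function names and the 0-based indexing of actions are assumptions of this example. The value MV(Γ₃) = 2 additionally uses the fact, argued in the proof, that every Nash equilibrium has social welfare 1, which the code does not verify.

```python
from fractions import Fraction
from itertools import product

# Payoffs of Γ₃ (Figure 4.11), keyed by 0-based action indices (a, b, c).
GAMMA3 = {
    (0, 0, 0): (1, 1, 0), (0, 1, 0): (1, 0, 0),
    (1, 0, 0): (0, 1, 0), (1, 1, 0): (0, 1, 1),
    (0, 0, 1): (0, 1, 1), (0, 1, 1): (0, 1, 0),
    (1, 0, 1): (1, 0, 0), (1, 1, 1): (1, 1, 0),
    (0, 0, 2): (1, 0, 0), (0, 1, 2): (0, 0, 1),
    (1, 0, 2): (0, 0, 1), (1, 1, 2): (1, 0, 0),
}
SIZES = (2, 2, 3)

def is_correlated_equilibrium(game, sizes, mu):
    """Check every inequality of Definition 4.12 for the distribution `mu`,
    given as a dict from action profiles to probabilities."""
    for i, size in enumerate(sizes):
        for recommended, alternative in product(range(size), repeat=2):
            gain = Fraction(0)
            for profile, prob in mu.items():
                if profile[i] != recommended:
                    continue
                deviation = profile[:i] + (alternative,) + profile[i + 1:]
                gain += prob * (game[profile][i] - game[deviation][i])
            if gain < 0:  # following the recommendation is not optimal
                return False
    return True

def social_welfare(game, mu):
    """Expected sum of all players' payoffs under `mu`."""
    return sum(prob * sum(game[profile]) for profile, prob in mu.items())

quarter = Fraction(1, 4)
MU_STAR = {
    (0, 0, 0): quarter, (1, 1, 0): quarter,
    (0, 0, 1): quarter, (1, 1, 1): quarter,
}
print(is_correlated_equilibrium(GAMMA3, SIZES, MU_STAR))
print(social_welfare(GAMMA3, MU_STAR))
```

By contrast, a point mass on (a¹, b¹, c¹) fails the check, because player 3 would rather play c².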

Now consider the four-player game Γ₄ of Figure 4.12, which is a ranking game for rank payoffs ~p₁ = ~p₃ = (1, 0, 0, 0), ~p₂ = (1, 1, 0, 0), and ~p₄ = (1, 1, 1, 0), and rankings [1, 2, 4, 3], [1, 3, 2, 4], [3, 2, 4, 1], [2, 3, 1, 4], and [4, 1, 2, 3]. It is easily verified that none of the action profiles with social welfare 2 is a Nash equilibrium. Furthermore, player 4 strictly prefers action d² over d¹ as soon as one of the remaining action profiles for players 1 to 3, i.e., those in the upper half of the game where the social welfare is 1, is played with positive probability. Hence, d¹ is not played with positive probability in any Nash equilibrium of Γ₄, and every Nash equilibrium of Γ₄ has social welfare 1. In turn, consider the correlated strategy µ^∗ where action profiles (a¹, b¹, c¹, d¹), (a², b², c¹, d¹), (a¹, b¹, c², d¹), and (a², b², c², d¹) are played with probability 1/4 each. It is easily verified that none of the players can increase his payoff by unilaterally deviating from µ^∗. Hence, µ^∗ is a correlated equilibrium with social welfare 3, and MV(Γ₄) = 3.

For n > 4, we can restrict our attention to games where the additional players only have a single action. We return to the game Γ₄ of Figure 4.12 and transform it into a game Γ₄ⁿ with n > 4 players by assigning to players 5, 6, . . . , n a payoff of 1 in the four action profiles (a¹, b¹, c¹, d¹), (a², b², c¹, d¹), (a¹, b¹, c², d¹), and (a², b², c², d¹) that constitute the correlated equilibrium with maximum social welfare, and a payoff of zero in all other action profiles. Since the additional players cannot influence the outcome of the game, this construction does not affect the equilibria of the game. To see that the resulting game is a ranking game, consider the rank payoff vectors ~p₁ = ~p₃ = (1, 0, 0, . . . , 0), ~p₂ = (1, 1, 0, . . . , 0), and r^k_m = 1 if k ≤ m − 1 and 0 otherwise, for m ≥ 4. It is easily verified that we can retain the original payoffs of players 1 to 4 and at the same time assign a payoff of 0 or 1, respectively, to players 5 to n by ranking the latter according to their index and placing either no other players or exactly one other player behind them in the overall ranking. More precisely, Γ₄ⁿ is a ranking game by virtue of the above rank payoffs and rankings [1, 2, 4, 5, . . . , n, 3], [1, 3, 2, 4, 5, . . . , n], [3, 2, 4, 5, . . . , n, 1], [2, 3, 1, 4, 5, . . . , n], and [4, 1, 2, 3, 5, . . . , n]. Furthermore, MV(Γ₄ⁿ) = n − 1.

c¹         b¹           b²
a¹     (1, 0, 0)    (0, 1, ε)
a²     (0, 1, ε)    (1, 0, 1)

c²         b¹           b²
a¹     (0, 0, 1)    (1, 0, ε)
a²     (1, 0, ε)    (1, 0, ε)

Figure 4.13: Three-player ranking game Γ₅ used in the proof of Theorem 4.15
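The ranking construction for Γ₄ⁿ at the end of the proof of Theorem 4.14 can be checked by computing the payoff vector that each of the five rankings induces. The Python sketch below is an illustration with invented function names; players are numbered from 1 as in the text, and rank payoff vectors are stored as lists whose entry k − 1 is the payoff for rank k.

```python
def rank_payoff_vectors(n):
    """Rank payoff vectors of Γ₄ⁿ for n >= 4 players."""
    vectors = {
        1: [1] + [0] * (n - 1),
        2: [1, 1] + [0] * (n - 2),
        3: [1] + [0] * (n - 1),
    }
    for m in range(4, n + 1):
        # player m receives 1 for every rank k <= m - 1 and 0 otherwise
        vectors[m] = [1 if k <= m - 1 else 0 for k in range(1, n + 1)]
    return vectors

def rankings(n):
    """The five rankings of the construction; players 5..n keep index order."""
    extra = list(range(5, n + 1))
    return [
        [1, 2, 4] + extra + [3],
        [1, 3, 2, 4] + extra,
        [3, 2, 4] + extra + [1],
        [2, 3, 1, 4] + extra,
        [4, 1, 2, 3] + extra,
    ]

def payoffs(ranking, vectors):
    """Payoff vector (player 1 first) induced by a ranking."""
    result = [0] * len(ranking)
    for position, player in enumerate(ranking):
        result[player - 1] = vectors[player][position]
    return tuple(result)

n = 6  # illustrative number of players
for ranking in rankings(n):
    vector = payoffs(ranking, rank_payoff_vectors(n))
    print(ranking, vector, sum(vector))
```

For each n tried, the first four coordinates reproduce the payoff vectors of Γ₄, the additional players receive 1 exactly in the two rankings that already had social welfare 3, and the maximum social welfare is n − 1.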

Theorem 4.15. Let R be the class of ranking games with n > 2 players. Then, EV(R) =n−1, even if R only contains games without weakly dominated actions.

Proof. It suffices to show that for any n ≥ 3 there is a ranking game with enforcement value n − 1 in which no action is weakly dominated. Consider the game Γ₅ of Figure 4.13, which is a ranking game by virtue of rank payoff vectors ~p₁ = (1, 1, 0), ~p₂ = (1, 0, 0), and ~p₃ = (1, ε, 0) and rankings [1, 2, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1], and [1, 3, 2]. Obviously, all of the actions of Γ₅ are undominated and v_{S_N}(Γ₅) = 2. It remains to be shown that the social welfare in any correlated equilibrium of Γ₅ is at most (1 + ε), such that v_{C(Γ₅)}(Γ₅) → 1 and EV(Γ₅) → 2 for ε → 0.

Finding a correlated equilibrium that maximizes social welfare constitutes a linear programming problem constrained by the inequalities of Definition 4.12 and the probability constraints ∑_{a_N ∈ A_N} µ(a_N) = 1 and µ(a_N) ≥ 0 for all a_N ∈ A_N. Feasibility of this problem is a direct consequence of Nash’s existence theorem. Boundedness follows from boundedness of the quantity being maximized. To derive an upper bound for social welfare in a correlated equilibrium of Γ₅, we will transform the above linear program into its dual. Since the primal is feasible and bounded, the primal and the dual will have the same optimal value, in our case the maximum social welfare in a correlated equilibrium. The dual constitutes a minimization problem, and finding a feasible solution with objective value v shows that the optimal value cannot be greater than v. Since
