A Note on the L-P Formulation of Zero-Sum Sequential Games with Incomplete Information

Jean-Pierre Ponssard

September 1974

Research Memoranda are informal publications relating to ongoing or projected areas of research at IIASA. The views expressed are those of the author, and do not necessarily reflect those of IIASA.

Jean-Pierre Ponssard*

Abstract

Zero-sum games with incomplete information are formulated as linear programs in which the players' behavioral strategies appear as primal and dual variables. Known properties for these games may then be derived from duality theory.

1. Introduction

It has long been known that any zero-sum game defined in normal form (i.e., by the payoff matrix) is equivalent to a linear program in which the variables represent the players' mixed strategies (Dantzig [1]). However, many games of interest are usually defined in extensive form (i.e., by the game tree), and the exponential growth in the number of pure strategies then makes the normal form a purely theoretical tool, inadequate for computational purposes. In the extensive form, the number of variables increases only linearly with the number of information sets, so that any computational procedure based on this representation is especially attractive.

The objective of this note is to show that a special class of games defined in extensive form, namely zero-sum sequential games with incomplete information (Ponssard-Zamir [2]), may indeed be directly formulated as linear programs in which the variables represent the players' behavioral strategies. Apart from its computational interest, a side product of this formulation is a new proof for the properties of these games.

*On leave from the Centre d'Enseignement Supérieur du Management Public, 94112, Arcueil, and from Groupe de Gestion des Organisations, Ecole Polytechnique, 75005, Paris, France; research scholar at the International Institute for Applied Systems Analysis, Laxenburg, Austria.


2. Recall of the Definition of the Game

The game essentially consists of four steps (a full description may be obtained in [2]).

Step 0 Chance selects a move $k \in \{1,\dots,K\}$ according to a probability distribution $p^0 = (p^0_k)_{k=1,\dots,K}$ ($\forall k$: $p^0_k > 0$).

Step 1 Player 1 selects a move $i \in \{1,\dots,I\}$ knowing $k$.

Step 2 Player 2 selects a move $j \in \{1,\dots,J\}$ knowing $i$ but not $k$.

Final Step Player 2 pays an amount $a^i_{kj}$ to Player 1 ($\forall k, i, j$: $a^i_{kj}$ is a real number).

3. The L-P Formulation

3.1 Definitions of the Variables

For all $k$ and $i$ define Player 1's behavioral strategy by

$$x^k_i = \mathrm{Prob}(\text{move } i \mid \text{move } k)$$

and his expected security level conditional on move $i$ by $u_i$.

For all $i$ and $j$ define Player 2's behavioral strategy by

$$y^i_j = \mathrm{Prob}(\text{move } j \mid \text{move } i)$$

and his expected security level conditional on move $k$ by $v_k$.
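To make the roles of the two behavioral strategies concrete, the following minimal sketch evaluates the expected payment for given strategies, together with the amount each player can guarantee with a fixed strategy. The function names, the nested-list layout `a[i][k][j]` and the toy payoffs are assumptions made for illustration; they do not come from the note or from [2].

```python
def expected_payoff(p0, a, x, y):
    """Expected amount paid by Player 2 to Player 1.

    p0[k]      : prior probability of chance move k
    a[i][k][j] : payoff a^i_{kj}
    x[k][i]    : Prob(move i | move k), Player 1's behavioral strategy
    y[i][j]    : Prob(move j | move i), Player 2's behavioral strategy
    """
    return sum(
        p0[k] * x[k][i] * y[i][j] * a[i][k][j]
        for k in range(len(p0))
        for i in range(len(a))
        for j in range(len(a[i][k]))
    )


def player1_guarantee(p0, a, x):
    """Sum over i of min over j of sum_k a^i_{kj} p0_k x^k_i."""
    total = 0.0
    for i in range(len(a)):
        total += min(
            sum(a[i][k][j] * p0[k] * x[k][i] for k in range(len(p0)))
            for j in range(len(a[i][0]))
        )
    return total


def player2_guarantee(p0, a, y):
    """Sum over k of p0_k times max over i of sum_j a^i_{kj} y^i_j."""
    return sum(
        p0[k] * max(
            sum(a[i][k][j] * y[i][j] for j in range(len(y[i])))
            for i in range(len(a))
        )
        for k in range(len(p0))
    )


# Hypothetical game with K = I = J = 2 (illustrative numbers only).
p0 = [0.5, 0.5]
a = [
    [[1, 2], [-1, 0]],   # a[0][k][j]: payoffs after move i = 1
    [[-1, 0], [1, 2]],   # a[1][k][j]: payoffs after move i = 2
]
x = [[1.0, 0.0], [0.0, 1.0]]  # Player 1 plays move k+1 in state k
y = [[1.0, 0.0], [1.0, 0.0]]  # Player 2 always answers with j = 1

payoff = expected_payoff(p0, a, x, y)
lower = player1_guarantee(p0, a, x)
upper = player2_guarantee(p0, a, y)
```

For any pair of strategies the expected payment lies between the two guarantees; when the guarantees coincide, as they do in this toy game, both strategies are optimal.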


3.2 Player 2's Problem

subject to
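One linear program consistent with the variables of Section 3.1 and with the problem size quoted in Section 3.4 is sketched below. It is an assumption rather than a quotation: Player 2 is taken to minimize the prior-weighted sum of the security levels $v_k$, each of which must cover every move $i$ available to Player 1 in state $k$.

```latex
\begin{aligned}
\min\ & \sum_{k=1}^{K} p^0_k v_k \\
\text{subject to}\ & v_k \ge \sum_{j=1}^{J} a^i_{kj}\, y^i_j,
  && k = 1,\dots,K;\ i = 1,\dots,I, \\
& \sum_{j=1}^{J} y^i_j = 1, && i = 1,\dots,I, \\
& y^i_j \ge 0, && i = 1,\dots,I;\ j = 1,\dots,J.
\end{aligned}
```

Written this way, the program has $K \times I + I$ constraints in $K + I \times J$ variables.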

3.3 Player 1's Problem

$$\max \sum_{i=1}^{I} u_i$$

subject to
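A set of constraints consistent with the definitions of Section 3.1 and with the complementary slackness conditions of Section 5 is sketched below. It is an assumption rather than a quotation: each $u_i$ is bounded by what every reply $j$ of Player 2 leaves to Player 1 after move $i$, weighted by the joint probability $p^0_k x^k_i$ of state $k$ and move $i$.

```latex
\begin{aligned}
\max\ & \sum_{i=1}^{I} u_i \\
\text{subject to}\ & u_i \le \sum_{k=1}^{K} a^i_{kj}\, p^0_k\, x^k_i,
  && i = 1,\dots,I;\ j = 1,\dots,J, \\
& \sum_{i=1}^{I} x^k_i = 1, && k = 1,\dots,K, \\
& x^k_i \ge 0, && k = 1,\dots,K;\ i = 1,\dots,I.
\end{aligned}
```

The products $p^0_k x^k_i$ enter linearly because $p^0$ is data, so the program remains linear in $(x, u)$.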


3.4 A Comment on the Size of the Problem

Note that the dimensions of the matrix associated with these linear programs are $(K \times I + I) \times (K + I \times J)$, as opposed to $I^K \times J^I$ if we were to "reduce" the game to its normal form.
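The gap between the two sizes is easy to tabulate. The short sketch below simply evaluates the two expressions above; the function names and the sample values of K, I and J are illustrative assumptions.

```python
def lp_dimensions(K, I, J):
    """Rows and columns of the L-P matrix: (K*I + I) x (K + I*J)."""
    return (K * I + I, K + I * J)


def normal_form_dimensions(K, I, J):
    """Pure-strategy counts: I**K for Player 1 and J**I for Player 2."""
    return (I ** K, J ** I)


# Hypothetical sizes, chosen only to show the growth rates.
small_lp = lp_dimensions(3, 4, 5)
small_normal = normal_form_dimensions(3, 4, 5)
```

Player 1 has one pure strategy per map from states to moves, and Player 2 one per map from observed moves to replies, which is where the exponents come from.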

4. Property of the Value of the Game

Let the variables $(z^k_i)_{i=1,\dots,I;\ k=1,\dots,K}$ be defined as

$$z^k_i = p^0_k\, x^k_i.$$

Then it is immediate that by this transformation, Player 1's problem is the dual of Player 2's problem (recall that for all $k$, $p^0_k > 0$). Hence, denoting optimal values of the variables by a bar, we obtain from the duality theory

$$\sum_{i=1}^{I} \bar u_i = \sum_{k=1}^{K} p^0_k\, \bar v_k.$$

Thus the game has a value which is equal to the optimal values of the objective functions.

A property of this value is that it may be obtained from the concave hull of the value of an auxiliary game (see Theorem 1, page 101 in [2]). We shall now show that this property may be derived directly from our L-P formulation.

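Let us make a change of variables in Player 1's problem. Define

$$P = \Big\{ p = (p_k)_{k=1,\dots,K} : p_k \ge 0,\ \sum_{k=1}^{K} p_k = 1 \Big\}$$

and let the new variables

$$(\lambda_i)_{i=1,\dots,I}, \qquad (p^i_k)_{i=1,\dots,I;\ k=1,\dots,K}$$

be such that

$$\lambda_i = \sum_{k=1}^{K} p^0_k\, x^k_i, \qquad \lambda_i\, p^i_k = p^0_k\, x^k_i,$$

so that $p^i = (p^i_k)_{k=1,\dots,K}$ is a point in $P$, and if $\lambda_i = 0$, let $p^i$ be arbitrary in $P$.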

For all points $p$ in $P$ and $i = 1,\dots,I$, let the function $w^i(p)$ be

$$w^i(p) = \min_{j=1,\dots,J} \sum_{k=1}^{K} a^i_{kj}\, p_k$$

so that

$$\lambda_i\, w^i(p^i) = \min_{j=1,\dots,J} \sum_{k=1}^{K} a^i_{kj}\, p^0_k\, x^k_i.$$

Then Player 1's problem may be written as the following non-linear program: Find a convex combination $(\lambda_i)_{i=1,\dots,I}$ and $I$ points $(p^i)_{i=1,\dots,I}$ in $P$ such that

$$\max \sum_{i=1}^{I} \lambda_i\, w^i(p^i)$$

subject to

$$\sum_{i=1}^{I} \lambda_i\, p^i = p^0.$$

Let $\operatorname{Cav} w(p)$ denote the concave hull of the function $w(p)$ defined as $w(p) = \max_{i=1,\dots,I} w^i(p)$. $w(p)$ may be interpreted as the value, as a function of $p$, of the game in which Player 1 moves without knowing $k$ (see step 2 in section 3). The optimal value of this non-linear program, and thus the value of the game, may then be expressed as $\operatorname{Cav} w(p^0)$.
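For two states ($K = 2$) the concave hull can be approximated numerically, which gives a quick check of the statement above. The sketch below is a brute-force grid search over pairs of points, not the linear program of Section 3; the function names, the grid resolution and the toy payoffs `a[i][k][j]` are assumptions made for illustration.

```python
def w_i(a_i, p):
    """w^i(p) = min over j of sum_k a^i_{kj} p_k, with a_i[k][j] = a^i_{kj}."""
    return min(
        sum(a_i[k][j] * p[k] for k in range(len(p)))
        for j in range(len(a_i[0]))
    )


def w(a, p):
    """w(p) = max over i of w^i(p): Player 1 moves without knowing k."""
    return max(w_i(a_i, p) for a_i in a)


def cav_w_two_states(a, q0, grid=200):
    """Approximate the concave hull of w at p = (q0, 1 - q0) for K = 2.

    Searches over pairs of grid points q_left <= q0 <= q_right and keeps
    the best linear interpolation of w between them.
    """
    qs = [n / grid for n in range(grid + 1)]
    ws = [w(a, (q, 1.0 - q)) for q in qs]
    best = w(a, (q0, 1.0 - q0))
    for q_left, w_left in zip(qs, ws):
        if q_left > q0:
            break
        for q_right, w_right in zip(qs, ws):
            if q_right < q0 or q_right == q_left:
                continue
            t = (q0 - q_left) / (q_right - q_left)
            best = max(best, (1.0 - t) * w_left + t * w_right)
    return best


# Hypothetical payoffs a[i][k][j] (illustrative numbers only).
a = [
    [[1, 2], [-1, 0]],
    [[-1, 0], [1, 2]],
]
value_without_information = w(a, (0.5, 0.5))
value_with_information = cav_w_two_states(a, 0.5)
```

In this toy game $w$ vanishes at the uniform prior, while the hull is, up to rounding, equal to 1 there: Player 1 gains by splitting the prior into the two extreme posteriors, that is, by revealing $k$ through his move.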

5. Property of the Optimal Strategies

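The complementary slackness conditions associated with the two linear programs give at the optimum, for $i = 1,\dots,I$,

$$\sum_{k=1}^{K} p^0_k\, \bar x^k_i \Big( \bar v_k - \sum_{j=1}^{J} a^i_{kj}\, \bar y^i_j \Big) = 0$$

and

$$\sum_{j=1}^{J} \bar y^i_j \Big( \bar u_i - \sum_{k=1}^{K} a^i_{kj}\, p^0_k\, \bar x^k_i \Big) = 0,$$

which, combined together, give

$$i = 1,\dots,I: \qquad \bar u_i = \sum_{k=1}^{K} \bar v_k\, p^0_k\, \bar x^k_i$$

or in terms of the new variables defined in section 4

$$i = 1,\dots,I: \qquad \bar\lambda_i\, w^i(\bar p^i) = \bar\lambda_i \sum_{k=1}^{K} \bar v_k\, \bar p^i_k.$$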

The interpretation of this equality is as follows. If Player 2 knew Player 1's optimal strategy, then, for all moves $i$ which may occur with positive probability ($\bar\lambda_i > 0$), he could compute a posterior probability distribution on $k$ ($\bar p^i$) and select his strategy so as to minimize Player 1's expectations conditional on move $i$ ($w^i(\bar p^i)$). On the other hand, Player 2's security level associated with his optimal strategy and evaluated at $\bar p^i$ is $\sum_{k=1}^{K} \bar v_k\, \bar p^i_k$. Thus, the equality is the special formulation in the context of this game of the general minimax statement that Player 2 cannot benefit from knowing Player 1's optimal strategy.
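The posterior computation mentioned in this interpretation is an application of Bayes' rule. The sketch below derives, from a prior and a behavioral strategy of Player 1, the probability of each move and the posterior on $k$ given that move; the function name and the sample numbers are illustrative assumptions.

```python
def posteriors(p0, x):
    """Return (lam, post) for prior p0[k] and strategy x[k][i].

    lam[i]     : probability that Player 1 plays move i
    post[i][k] : posterior probability of state k given move i,
                 or None when move i has probability zero
    """
    K, I = len(p0), len(x[0])
    lam = [sum(p0[k] * x[k][i] for k in range(K)) for i in range(I)]
    post = [
        [p0[k] * x[k][i] / lam[i] for k in range(K)] if lam[i] > 0 else None
        for i in range(I)
    ]
    return lam, post


# Hypothetical prior and strategy (illustrative numbers only).
p0 = [0.5, 0.5]
x = [[1.0, 0.0], [0.5, 0.5]]  # x[k][i] = Prob(move i | move k)
lam, post = posteriors(p0, x)
```

By construction the posteriors, weighted by the move probabilities, average back to the prior.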


6. Example

As an illustration, Player 2's linear program for the example presented in [2] with the specification that

Min 1 2

T V 1 + TV2

s.t. 5 1 1

1

-

7 Y 1 + 2y2

1 1

v 2

-

lYl

-

l0y2

Y: + Y, 1


References

[1] Dantzig, G.B. "A Proof of the Equivalence of the Programming Problem and the Game Problem," in Activity Analysis of Production and Allocation, Koopmans, T.C., ed., Cowles Commission Monograph 13, Wiley, 1951.

[2] Ponssard, J.-P. and Zamir, S. "Zero Sum Sequential Games with Incomplete Information," Int. J. of Game Theory, 2, No. 2 (1973).
