
THE SOLUTION OF A TWO-PERSON POKER VARIANT

Jean-Pierre Ponssard

June 1974

Working Papers are not intended for distribution outside of IIASA, and are solely for discussion and information purposes. The views expressed are those of the author, and do not necessarily reflect those of IIASA.

WP-74-16

International Institute for Applied Systems Analysis, Laxenburg, Austria

(On leave from the Centre d'Enseignement Superieur du Management Public, 94112, Arcueil, and from Groupe de Recherche en Gestion des Organisations, Ecole Polytechnique, 75005, Paris, France)

ABSTRACT

This note presents the solution of a two-person poker variant considered by Friedman [1]. The solution is derived using a general algorithm proposed by the author to solve two-person zero-sum games with 'almost' perfect information [2].


1. Description of the Game

The following poker game is a slight generalization of Friedman's "Simple Bluffing Situation with Possible Reraise" [1]. Player I has a low card up and one card down; Player II has a high card up and one card down. If both players have either a high or a low card down, then Player II wins; otherwise, the player with the high card down wins. There are $n$ units in the pot. Player I may either drop or raise 1 unit. Then Player II may either drop, call or reraise $(m-1)$ units. Finally, Player I may either drop or call (in Friedman's example $n = 1$, $m = 4$).

Let $p_0$ and $q_0$ be the respective probabilities that Player I and Player II have a high card down. Of course, each player knows whether his own card down is high or low.
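The rules above can be turned into a small payoff function. The sketch below is only an illustration under stated assumptions: payoffs are counted as Player I's net gain (the $n$ units in the pot count as a gain when he wins them), a reraise means Player II has staked $m$ units in total, and the two hidden cards are drawn independently. The names `payoff` and `expected_payoff` are hypothetical.

```python
from fractions import Fraction

def payoff(m1, m2, m3, high1, high2, n, m):
    """Net gain of Player I at a terminal node (Player II receives the negative)."""
    one_wins = high1 and not high2          # Player I wins a showdown only with (H, L)
    if m1 == "D":
        return 0                            # Player I drops at once
    if m2 == "D":
        return n                            # Player II drops: Player I takes the pot
    if m2 == "C":
        return n + 1 if one_wins else -1    # showdown with 1 unit staked by each player
    if m3 == "D":
        return -1                           # Player I abandons his 1-unit raise
    return n + m if one_wins else -m        # showdown with m units staked by each player

def expected_payoff(m1, m2, m3, p, q, n, m):
    """Expected payoff when every hand follows the moves m1, m2, m3 (independent cards assumed)."""
    total = 0
    for high1, w1 in ((True, p), (False, 1 - p)):
        for high2, w2 in ((True, q), (False, 1 - q)):
            total += w1 * w2 * payoff(m1, m2, m3, high1, high2, n, m)
    return total

# Friedman's example (n = 1, m = 4) with p = 1/2, q = 1/4: raise followed by a call
print(expected_payoff("R", "C", None, Fraction(1, 2), Fraction(1, 4), 1, 4))
```

Under these assumptions the raise-call payoff reduces to $p(1-q)(n+2) - 1$ and the raise-reraise-call payoff to $p(1-q)(n+2m) - m$, both bilinear in $(p, q)$.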

2. Computation of the Value of the Game

The solution will be derived using a general algorithm proposed by the author to solve two-person zero-sum games with 'almost' perfect information [2]. For convenience we shall use the same notation as in [2].

Let the letters D, C, R stand for drop, call and raise respectively, and let $m_1 \in \{D, R\}$, $m_2 \in \{D, C, R\}$, $m_3 \in \{D, C\}$. Let $V^{m_1 m_2 m_3}$ be the value of the $m_1$-$m_2$-$m_3$ restricted game; that is, the game in which the players' choice sets are restricted to the unique elements $m_1$, $m_2$, $m_3$ respectively. Then we have (theorem 1 in [2]):

$$V^{m_1 m_2} = \operatorname{Cav}_{p} \max_{m_3} \{ V^{m_1 m_2 m_3} \},$$

$$V^{m_1} = \operatorname{Vex}_{q} \min_{m_2} \{ V^{m_1 m_2} \},$$

and finally:

$$V = \operatorname{Cav}_{p} \max_{m_1} \{ V^{m_1} \}.$$

We shall make the computation stage by stage and represent the functions on the unit square ($0 \le p \le 1$, $0 \le q \le 1$). It will turn out that all functions are "rectangle-wise" linear (of the form $\alpha pq + \beta p + \lambda q + \delta$ on rectangles), so that only the values at the extremal points of the rectangles need be computed.

For the computation of the optimal strategies, it will also be helpful to keep track of how the Cav and Vex are constructed. This will be done by labeling the corresponding vertices of the rectangles. (For instance, for $q \in [0, (m+n+1)/(2m+n)]$, $V^{RR}$ is a convex combination of $V^{RRD}$ at $p = 0$ and $V^{RRC}$ at $p = 1$.)
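The stage-by-stage computation can be mimicked numerically. The sketch below is not the author's procedure: it samples the unit square on a rational grid and builds each Cav and Vex one variable at a time with a hull scan. The restricted values it uses ($V^{RD} = n$, $V^{RC} = p(1-q)(n+2) - 1$, $V^{RRD} = -1$, $V^{RRC} = p(1-q)(n+2m) - m$, $V^{D} = 0$), the reading of Cav as taken over $p$ and Vex over $q$, and the names `cav`, `vex`, `solve` are assumptions made for illustration. The envelopes are exact only when the breakpoints lie on the grid.

```python
from fractions import Fraction

def cav(xs, ys):
    """Concave envelope of the points (xs[i], ys[i]), xs increasing, evaluated at xs."""
    hull = []
    for x, y in zip(xs, ys):
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            # drop the middle point if it lies on or below the chord to the new point
            if (y2 - y1) * (x - x1) <= (y - y1) * (x2 - x1):
                hull.pop()
            else:
                break
        hull.append((x, y))
    out, j = [], 0
    for x in xs:
        while hull[j + 1][0] < x:
            j += 1
        (x1, y1), (x2, y2) = hull[j], hull[j + 1]
        out.append(y1 + (y2 - y1) * (x - x1) / (x2 - x1))
    return out

def vex(xs, ys):
    """Convex envelope, obtained from cav by a sign change."""
    return [-v for v in cav(xs, [-y for y in ys])]

def solve(n, m, grid):
    """Return {(i, j): V(grid[i], grid[j])} with i indexing p and j indexing q."""
    idx = range(len(grid))
    v_rr, v_r, v = {}, {}, {}
    for j, q in enumerate(grid):          # last stage: Cav over p of max(V^RRD, V^RRC)
        row = cav(grid, [max(-1, p * (1 - q) * (n + 2 * m) - m) for p in grid])
        for i in idx:
            v_rr[i, j] = row[i]
    for i, p in enumerate(grid):          # middle stage: Vex over q of min(V^RD, V^RC, V^RR)
        col = vex(grid, [min(n, p * (1 - q) * (n + 2) - 1, v_rr[i, j])
                         for j, q in enumerate(grid)])
        for j in idx:
            v_r[i, j] = col[j]
    for j in idx:                         # first stage: Cav over p of max(V^D, V^R)
        row = cav(grid, [max(0, v_r[i, j]) for i in idx])
        for i in idx:
            v[i, j] = row[i]
    return v

grid = [Fraction(k, 12) for k in range(13)]   # contains 1/3 and 2/3, the breakpoints for n = 1, m = 4
value = solve(1, 4, grid)
print(value[4, 2])                            # V at p = 1/3, q = 1/6
```

Exact rational arithmetic is used so that the grid values can be compared directly with closed-form expressions; a floating-point grid would work the same way up to rounding.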

[Figure: stage-by-stage representation on the unit square of the functions $V^{RRC}$, $V^{RRD}$, $V^{RR}$, $V^{RD}$ and of the resulting functions up to $V$, with their values at the vertices of each rectangle ($n+m$, $-m$, $n+1$, $n$, $-1$, $0$) and the breakpoints $p = (n+1)/(n+2)$, $q = (m+n+1)/(2m+n)$ and $q = n(m+n+1)/((n+1)(2m+n))$. Vertices are labeled (Player I's hidden card, Player II's hidden card).]

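3. Computation of the Optimal Behavioral Strategies

3.1 Player I's optimal first move

For $p_0 \in [0, (n+1)/(n+2)]$ and $q_0 \in [0, n(m+n+1)/((n+1)(2m+n))]$

Step 1

It is easily seen that

$$V(p_0, q_0) = \alpha_D V^{D}(0, q_0) + \alpha_R V^{R}(p_R, q_0),$$

with

$$p_0 = \alpha_D \cdot 0 + \alpha_R (n+1)/(n+2).$$

Thus $p_D = 0$, $p_R = (n+1)/(n+2)$, $\alpha_R = p_0(n+2)/(n+1)$, $\alpha_D = 1 - \alpha_R$,

so that Player I's optimal move may be written as follows:

Player I      Prob($m_1$ | L)                 Prob($m_1$ | H)
$m_1 = D$     $\alpha_D/(1-p_0)$              0
$m_1 = R$     $\alpha_R(1-p_R)/(1-p_0)$       1

For $p_0 \in [(n+1)/(n+2), 1]$ and $q_0 \in [0, n(m+n+1)/((n+1)(2m+n))]$

$$V(p_0, q_0) = V^{R}(p_0, q_0).$$

Hence Player I raises independently of his state.

For $q_0 \notin [0, n(m+n+1)/((n+1)(2m+n))]$

$$V(p_0, q_0) = V^{D}(p_0, q_0).$$

Hence Player I drops independently of his state.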
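The three cases of this subsection can be collected in one function. This is a minimal sketch, assuming the decomposition $p_0 = \alpha_D \cdot 0 + \alpha_R p_R$ with $p_R = (n+1)/(n+2)$, so that $\alpha_R = p_0/p_R$; the function name `first_move` and the dictionary layout are hypothetical, and boundary cases are assigned to the closed intervals as in the text.

```python
from fractions import Fraction

def first_move(p0, q0, n, m):
    """Player I's first move as {card: {move: probability}}, for 0 <= p0 < 1."""
    p_r = Fraction(n + 1, n + 2)
    q_star = Fraction(n * (m + n + 1), (n + 1) * (2 * m + n))
    if q0 > q_star:                       # V = V^D: drop whatever the card
        return {"L": {"D": 1, "R": 0}, "H": {"D": 1, "R": 0}}
    if p0 >= p_r:                         # V = V^R: raise whatever the card
        return {"L": {"D": 0, "R": 1}, "H": {"D": 0, "R": 1}}
    alpha_r = p0 / p_r                    # weight on the posterior p_R
    alpha_d = 1 - alpha_r                 # weight on the posterior 0
    return {
        "L": {"D": alpha_d / (1 - p0), "R": alpha_r * (1 - p_r) / (1 - p0)},
        "H": {"D": 0, "R": 1},
    }

# Friedman's example (n = 1, m = 4) with p0 = 1/3, q0 = 1/6
print(first_move(Fraction(1, 3), Fraction(1, 6), 1, 4))
```

In the mixed case the two probabilities for a low card sum to one because $\alpha_D + \alpha_R(1 - p_R) = 1 - p_0$.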


3.2 Player II's optimal first move

Given that Player I raised, Player II may either drop, call or reraise.

For $p_0 \in [0, (n+1)/(n+2)]$ and $q_0 \in [0, n(m+n+1)/((n+1)(2m+n))]$

Step 1

Since Player I raised, we have $p_1 = p_R$. It is easily seen that we have two extremal Bayesian best responses for Player II.

k = 1

$$V^{R}(p_R, q_0) = \beta_D^1 V^{RD}(p_R, 0) + \beta_R^1 V^{RR}(p_R, (m+n+1)/(2m+n)),$$

with

$$q_0 = \beta_D^1 \cdot 0 + \beta_R^1 (m+n+1)/(2m+n).$$

Thus $q_D^1 = 0$, $q_R^1 = (m+n+1)/(2m+n)$, $\beta_R^1 = q_0(2m+n)/(m+n+1)$, $\beta_D^1 = 1 - \beta_R^1$.

$y^1$         Prob($m_2$ | L)                     Prob($m_2$ | H)
$m_2 = D$     $\beta_D^1/(1-q_0)$                 0
$m_2 = R$     $\beta_R^1(1-q_R^1)/(1-q_0)$        1

k = 2

$$V^{R}(p_R, q_0) = \beta_C^2 V^{RC}(p_R, 0) + \beta_R^2 V^{RR}(p_R, (m+n+1)/(2m+n)),$$

with

$$q_0 = \beta_C^2 \cdot 0 + \beta_R^2 (m+n+1)/(2m+n).$$

Thus $q_C^2 = 0$, $q_R^2 = (m+n+1)/(2m+n)$, $\beta_R^2 = q_0(2m+n)/(m+n+1)$, $\beta_C^2 = 1 - \beta_R^2$.

$y^2$         Prob($m_2$ | L)                     Prob($m_2$ | H)
$m_2 = C$     $\beta_C^2/(1-q_0)$                 0
$m_2 = R$     $\beta_R^2(1-q_R^2)/(1-q_0)$        1

(Notice that $q_R^1 = q_R^2$ and $\beta_R^1 = \beta_R^2$, so that the k index may be dropped.)

Step 2

We now have to find the convex combination of these two Bayesian best responses which is in equilibrium with Player I's first move (that is, which makes him indifferent between bluffing or not, if he has a low card).

The supporting hyperplane to $V(p, q_0)$ for $p \in [0, p_R]$ has for equation:

$$Y = [n - q_0(n+1)(2m+n)/(m+n+1)] (n+2)p/(n+1).$$

The hyperplanes associated with the two Bayesian best responses are easily identified, since $y^1$ is a Bayesian best response for $p \in [p_R, 1]$ and $y^2$ for $p \in [0, p_R]$. Thus

$$Y_1 = n - q_0(n+1)(2m+n)/(m+n+1),$$

$$Y_2 = [1 - q_0(2m+n)/(m+n+1)] (n+2)p - 1.$$

Requiring $\mu_1 Y_1 + \mu_2 Y_2 = Y$ with $\mu_1 + \mu_2 = 1$ gives

$$\mu_1 = 1/\{(n+1)[1 - q_0(2m+n)/(m+n+1)]\}, \qquad \mu_2 = 1 - \mu_1.$$

Player II's optimal strategy may be interpreted as follows:

- if he has a high hand, he reraises,

- otherwise, he reraises with probability $\beta_R(1-q_R)/(1-q_0)$ or, given that he does not reraise, he drops with probability $\mu_1$ and calls with probability $\mu_2$.
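The equilibrium condition of Step 2 can be checked directly. The following sketch assumes $\beta_R = q_0(2m+n)/(m+n+1)$, reads $\mu_1$ as $1/\{(n+1)[1 - q_0(2m+n)/(m+n+1)]\}$, and assumes that a low-card Player I who bluffs and is reraised then drops. The names `second_move` and `bluff_value` are hypothetical. It verifies that a bluff earns a low-card Player I exactly the payoff of dropping, namely 0.

```python
from fractions import Fraction

def second_move(q0, n, m):
    """Player II's move after a raise, as {card: {move: probability}} (mixed case of 3.2)."""
    q_r = Fraction(m + n + 1, 2 * m + n)
    beta_r = q0 / q_r                               # weight on the posterior q_R
    reraise_low = beta_r * (1 - q_r) / (1 - q0)     # Prob(reraise | low card)
    mu1 = 1 / ((n + 1) * (1 - beta_r))              # Prob(drop | low card, no reraise)
    low = {
        "R": reraise_low,
        "D": (1 - reraise_low) * mu1,
        "C": (1 - reraise_low) * (1 - mu1),
    }
    return {"L": low, "H": {"R": 1, "D": 0, "C": 0}}

def bluff_value(q0, n, m):
    """Expected payoff to Player I of raising with a low card (he drops if reraised)."""
    drop = (1 - q0) * second_move(q0, n, m)["L"]["D"]   # Player II drops only with a low card
    return drop * n - (1 - drop)                        # wins n if II drops, loses 1 otherwise

# Friedman's example (n = 1, m = 4) with q0 = 1/6
print(second_move(Fraction(1, 6), 1, 4)["L"], bluff_value(Fraction(1, 6), 1, 4))
```

The sketch is meaningful only for $q_0 \le n(m+n+1)/((n+1)(2m+n))$; beyond that bound $\mu_1$ would exceed 1.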

For $p_0 \in [(n+1)/(n+2), 1]$ and $q_0 \in [0, n(n+m+1)/((n+1)(2m+n))]$

Step 1

Since Player I raised independently of his state, $p_1 = p_0$. There is only one Bayesian best response for Player II; it is $y^1$ as described in Step 1 of the previous case, thus it is Player II's optimal first move.

3.3 Player I's optimal second move

The procedure used in 3.2 may be repeated. We shall only give the final result.

For $p_0 \in [0, (n+1)/(n+2)]$

Player I      Prob($m_3$ | L)     Prob($m_3$ | H)
$m_3 = D$     1                   $(m-1)/(m+n+1)$
$m_3 = C$     0                   $(n+2)/(m+n+1)$

For $p_0 \in [(n+1)/(n+2), 1]$

Player I      Prob($m_3$ | L)     Prob($m_3$ | H)
$m_3 = D$     1                   $1 - (n+1)/((m+n+1)p_0)$
$m_3 = C$     0                   $(n+1)/((m+n+1)p_0)$
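One way to read these two tables is that Player I's calling frequency with a high card keeps a low-card Player II indifferent among the moves he actually mixes over. The sketch below checks this under stated assumptions: the calling probabilities are $(n+2)/(m+n+1)$ for $p_0 \le (n+1)/(n+2)$ and $(n+1)/((m+n+1)p_0)$ otherwise, and Player II's belief after the raise is $\max(p_0, (n+1)/(n+2))$. The names `last_move` and `low_hand_options` are hypothetical.

```python
from fractions import Fraction

def last_move(p0, n, m):
    """Player I's reply to a reraise, as {card: {move: probability}}."""
    p_r = Fraction(n + 1, n + 2)
    if p0 <= p_r:
        call_high = Fraction(n + 2, m + n + 1)
    else:
        call_high = Fraction(n + 1, m + n + 1) / p0
    return {"L": {"D": 1, "C": 0}, "H": {"D": 1 - call_high, "C": call_high}}

def low_hand_options(p0, n, m):
    """Payoff to Player I of each move of a low-card Player II, given that Player I raised."""
    p1 = max(p0, Fraction(n + 1, n + 2))            # belief that Player I holds a high card
    call_high = last_move(p0, n, m)["H"]["C"]
    return {
        "D": n,                                     # Player II drops
        "C": p1 * (n + 2) - 1,                      # showdown with 1 unit each
        "R": p1 * call_high * (n + m + 1) - 1,      # reraise: called only by a high card
    }

# Friedman's example (n = 1, m = 4)
print(low_hand_options(Fraction(1, 3), 1, 4))       # all three moves cost Player II the same
print(low_hand_options(Fraction(3, 4), 1, 4))       # calling is now strictly worse for Player II
```

In the second case only dropping and reraising remain equally good for Player II, which agrees with $y^1$ being his only Bayesian best response there.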

While the optimal strategies may appear complicated, the description of the "story" of the game in terms of the graph of conditional probabilities is quite simple. Here is such a story for $p_0 \in [0, (n+1)/(n+2)]$, $q_0 \in [0, n(n+m+1)/((n+1)(2m+n))]$.

[Figure: graph of conditional probabilities on the unit square with vertices (L,L), (H,L) and (L,H), showing the moves D, C and R and the levels $p = (m-1)(n+1)/(m(n+2))$, $p = (n+1)/(n+2)$ and $q = (m+n+1)/(2m+n)$.]

Starting with the prior probabilities $p_0$ and $q_0$, an observer of the game could derive the following conditional probabilities:

- Player I drops: he has a low card;

- Player I raises: the probability that he has a high card jumps from $p_0$ to $(n+1)/(n+2)$;

- Player II calls or drops: he has a low card;

- Player II reraises: the probability that he has a high card jumps from $q_0$ to $(m+n+1)/(2m+n)$;

- Player I calls: he has a high card;

- Player I drops: the probability that he has a high card falls from $(n+1)/(n+2)$ to $(m-1)(n+1)/(m(n+2))$.

This sequence of conditional probabilities and the knowledge of $(\mu_1, \mu_2)$ fully describe the optimal behavioral strategies.

Ordinarily the conditional probabilities would be sufficient, except that here they do not completely specify Player II's strategy.
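The jumps in this story follow from Bayes' rule applied to the behavioral strategies. The sketch below recomputes them under stated assumptions: a high card always raises and always reraises, a low-card Player I bluffs with probability $\alpha_R(1-p_R)/(1-p_0)$ where $\alpha_R = p_0/p_R$, a low-card Player II reraises with probability $\beta_R(1-q_R)/(1-q_0)$ where $\beta_R = q_0/q_R$, and a high-card Player I drops after a reraise with probability $(m-1)/(m+n+1)$. The names `posterior` and `story` are hypothetical.

```python
from fractions import Fraction

def posterior(prior, like_high, like_low):
    """Bayes' rule: probability of a high card after observing a move."""
    num = prior * like_high
    return num / (num + (1 - prior) * like_low)

def story(p0, q0, n, m):
    """Conditional probabilities seen by an observer, for 0 < p0 <= p_R and 0 < q0 <= q*."""
    p_r = Fraction(n + 1, n + 2)
    q_r = Fraction(m + n + 1, 2 * m + n)
    bluff = (p0 / p_r) * (1 - p_r) / (1 - p0)          # Prob(raise | low) for Player I
    reraise_low = (q0 / q_r) * (1 - q_r) / (1 - q0)    # Prob(reraise | low) for Player II
    drop_high = Fraction(m - 1, m + n + 1)             # Prob(final drop | high) for Player I
    after_raise = posterior(p0, 1, bluff)
    after_reraise = posterior(q0, 1, reraise_low)
    after_drop = posterior(after_raise, drop_high, 1)  # a low card always drops
    return after_raise, after_reraise, after_drop

# Friedman's example (n = 1, m = 4) with p0 = 1/3, q0 = 1/6
print(story(Fraction(1, 3), Fraction(1, 6), 1, 4))
```

Whatever the admissible priors, the first two posteriors do not depend on $p_0$ and $q_0$, which is why the story can be told with fixed levels on the graph.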

4. Some Comments on Computational Feasibility

The use of this algorithm for real poker is severely limited by the fact that, so far, no numerical procedure is available for the Cav and Vex operators in more than two dimensions. Concavifications have to be carried out by hand using "visual judgments". On the other hand, the number of reraises and their amounts may be quite arbitrary with no further complications.


REFERENCES

[1] Friedman, L., "Optimal Bluffing Strategies in Poker", Management Science, Vol. 17, No. 12, Aug. 1971.

[2] Ponssard, J.P., "Zero Sum Games with 'Almost' Perfect Information", to appear in Management Science.
