Beauty‐contest game, bounded rationality and learning
Marta Serra‐Garcia, University of Munich
Behavioral and Experimental Economics, SS 2011 – June 29th, 2011
Outline
1. Beauty‐contest game – Nagel (1995)
– Bosch‐Domenech et al. (2002) – Camerer (2003, Ch. 5.2)
2. Bounded rationality
– Steps of reasoning or level‐k models (Camerer et al., 2004)
3. Learning
– Camerer (2003, Ch. 6)
1. Beauty‐contest game
• In many economic situations, individuals have to guess what others will do
• …and then figure out the best choice to make
• As noted by Keynes (1936), an example is the stock market
– Investors have to figure out what stock to buy or sell
– These decisions depend on:
• The stock‘s fundamental value
• But also, what others will do
1. Beauty‐contest game
• Keynes compares the stock market to the beauty contests run in newspapers in his time
– Readers who guessed which 6 of 100 faces would be evaluated as prettiest by other readers won a prize
“It is not a case of choosing those which, to the best of one's judgment, are really the prettiest, nor even those which average opinion genuinely thinks the prettiest. We have reached the third degree where we devote our intelligences to anticipating what average opinion expects the average opinion to be. And there are some, I believe, who practise the fourth, fifth and higher degrees.”
Keynes (1936, p. 156), reproduced in Camerer et al (2004)
1. Beauty‐contest game
• Aims
– How do people make decisions in such environments?
• Behavior of students but also other groups
– Do they follow a similar ‚reasoning approach‘?
• In steps or levels
– How do they learn over time?
1. Beauty‐contest game
• First examined experimentally by Nagel (1995)
• The game:
– N decision makers
– Simultaneously choose a real number from the interval I = [0,100]
– Winner:
• the decision maker whose number is closest to p·m
• p > 0 is fixed
• m denotes a particular order statistic (often mean or median)
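As a concrete illustration (not part of the original slides), the winner determination under these rules can be sketched in a few lines of Python, taking the mean as the order statistic and ignoring ties:

```python
# Toy p-beauty contest: the winner is the decision maker whose
# guess is closest to p times the mean of all guesses.
def winning_guess(guesses, p=2/3):
    target = p * sum(guesses) / len(guesses)  # p * m, with m = mean
    return min(guesses, key=lambda g: abs(g - target))

print(winning_guess([0, 10, 20, 30, 30, 40, 50, 50, 50, 60]))  # 20
```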
1. Beauty‐contest game
• Consider p=2/3 and m=mean
• Follow the process of eliminating weakly dominated strategies
• Note that the highest possible winning number is 100·p = 66⅔, since the mean is at most 100
• Guesses in (66⅔,100] are weakly dominated by 66⅔ – Why?
1. Beauty‐contest game
• If there is common knowledge of rationality…
– Then any guess in (44 4/9, 100] is also weakly dominated by 44 4/9
• Why? You and others know that ‚any other player‘ will not guess a number bigger than 66⅔
– In turn, any guess between (44 4/9)·⅔ ≈ 29.63 and 100 is also weakly dominated
– Following this process, one ends with 0
• Considering the mean as the relevant order statistic, the game is dominance solvable
• The process of iterated elimination of weakly dominated strategies leads to a unique equilibrium (which is zero, if p < 1).
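The shrinking upper bound from this elimination process can be traced numerically (a minimal sketch, assuming p = 2/3):

```python
# Iterated elimination of weakly dominated strategies: at each
# step of reasoning, the upper bound on undominated guesses
# shrinks by a factor p, converging to the equilibrium at 0.
def rational_upper_bounds(p=2/3, start=100.0, steps=10):
    bounds = []
    current = start
    for _ in range(steps):
        current *= p          # nobody should guess above p * (previous bound)
        bounds.append(current)
    return bounds

bounds = rational_upper_bounds()
print([round(b, 2) for b in bounds[:4]])  # [66.67, 44.44, 29.63, 19.75]
```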
1. Beauty‐contest game
• Nagel (1995) has 3 treatments:
1. p=1/2
2. p=2/3
3. p=4/3
• In all treatments, m=mean
• Sessions had 15‐18 subjects (playing each other)
• Prize=20 DM
• 4 rounds
– After each round all chosen numbers, mean, p*mean were written on blackboard
1. Beauty‐contest game
• How do students make decisions in the first round?
[Figures: histograms of first‐round choices in Nagel (1995), for p=1/2, p=2/3 and p=4/3]
1. Beauty‐contest game
• Conclusion:
– Student guesses strongly deviate from game‐theoretic solution
– Guesses differ widely, but seem to be concentrated in ‚focal points‘, potentially related to steps of reasoning
• Open questions:
– Do other subject pools behave similarly?
– What would the ‚steps of reasoning‘ approach imply for choices?
1. Beauty‐contest game
• Research question: do individuals who are not students behave similarly?
• Two studies
1. Bosch‐Domenech et al (2002, AER): financial newspapers
2. Camerer (2003, Ch. 5.2): miscellaneous groups
1. Beauty‐contest game
• Bosch‐Domenech et al (2002, AER)
• Experimental Design (p=2/3)
1. Beauty‐contest game
• Bosch‐Domenech et al (2002, AER)
• Results
1. Beauty‐contest game
• Bosch‐Domenech et al (2002, AER)
• Results 2 – other groups
1. Beauty‐contest game
• Camerer (2003, 5.2): miscellaneous groups
1. Beauty‐contest game
Conclusion
• Newspaper respondents come closer to the Nash Equilibrium than students
– Why?
• Stakes?
• Selection?
• Collusion?
• Theorists also seem to come closer…
• But in all cases there is wide variation!
2. Bounded rationality
Research question:
• Is there a similar reasoning process behind these choices?
• Several papers argue that individuals reason in ‚steps‘
– Nagel (1995) in the guessing game
– Stahl and Wilson (1995) applied to different games
– Camerer et al (2004): Cognitive Hierarchy model
• What would be a step‐0 in the guessing game?
2. Bounded rationality
• In the guessing game:
– Step‐0: chooses 50 (may be a salient point)
– Step‐1: chooses 50p
– Step‐2: chooses 50p²
• In general:
– Step‐0: often assumed to choose randomly or what is salient
– Step‐1: best‐responds to step‐0
– Step‐2: best‐responds to:
• Step‐1 only, in Nagel (1995) and Stahl and Wilson (1995)
• A mixture of step‐1 and step‐0, in Camerer et al (2004)
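The level‐k predictions just described can be computed directly (a minimal sketch; the anchor 50 and p = 2/3 follow the slides):

```python
# Level-k choices in the guessing game: step-0 picks the salient
# midpoint 50; each further step best-responds by multiplying by p.
def level_k_guess(k, p=2/3, anchor=50.0):
    return anchor * p ** k

print([round(level_k_guess(k), 2) for k in range(4)])
# [50.0, 33.33, 22.22, 14.81]
```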
2. Bounded rationality
• To classify types, Nagel (1995) builds an interval around the choice of each ‚step‐n‘ player
• If p<1:
– For step‐0, the interval is [50p^(1/4), 50]
– For step‐1, the interval is [50p^(1+1/4), 50p^(1−1/4)]
– ...
• And classifies the choices into these intervals
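A possible implementation of this classification, assuming the geometric intervals [50·p^(n+1/4), 50·p^(n−1/4)] with the step‐0 interval capped at 50 (the boundaries here are a reading of the slide, not a definitive reconstruction):

```python
# Classify a guess into Nagel-style reasoning steps: a guess is
# step n if it lies in the interval [50*p^(n+1/4), 50*p^(n-1/4)].
def classify_step(guess, p=1/2, max_step=4):
    for n in range(max_step + 1):
        hi = 50.0 if n == 0 else 50 * p ** (n - 1/4)
        lo = 50 * p ** (n + 1/4)
        if lo <= guess <= hi:
            return n
    return None  # falls between intervals: unclassified

print(classify_step(50))  # 0 (the salient midpoint)
print(classify_step(25))  # 1 (equal to 50p for p = 1/2)
```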
2. Bounded rationality
[Figures: classification of first‐round choices into step intervals, for p=1/2 and p=2/3]
2. Bounded rationality
• Conclusion:
– The model of reasoning in steps seems to capture the ‚nature‘ of the guessing game well
– Most subjects are classified into
• Step 1
• Step 2
– Step‐0 is often thought of as a type that ‚could‘ exist
2. Bounded rationality
Research question:
• Is brain activation during decision making consistent with this theory of reasoning?
• Coricelli and Nagel (2009, PNAS)
• Subjects in a scanner, make decisions in a beauty contest
• 13 different p‘s, in different orders, in 3 treatments:
– Play against 9 other subjects
– Play against a computer (which randomly draws 9 other numbers)
– Guess a number
• 3 subjects seem to decide ‚randomly‘
2. Bounded rationality
• Classify individuals into ‚levels‘
– 10 subjects: close to L(1) against the computer, and close to L(1) against others
– 7 subjects: close to L(1) against the computer, but close to L(2) or higher against others
2. Bounded rationality
• Are their brain activities different?
– Medial prefrontal cortex: related to 3rd‐person perspective taking and thinking about similar others
– Rostral anterior cingulate: related to self‐referential thinking in social cognitive tasks
2. Bounded rationality
• Conclusion:
– Low and high reasoning subjects are also found in the scanner
– Importantly, their brain activity seems to differ in some dimensions
• High reasoning individuals reveal a stronger activation of the medial prefrontal cortex
• This reasoning process or model of bounded rationality is static
– What happens if individuals repeat choices in the guessing game?
– How can we explain changes in their choices? Learning!
3. Learning
• Nagel (1995) has 4 rounds
• How do guesses change across these rounds?
3. Learning
• From round 1 to round 4, choices change substantially!
• By round 4, the average guess per session is below 17 (between 16.7 and 3.2)
• Compared to round 1, where it is 32.9 or higher, there is evident learning – though variation persists...
3. Learning
Camerer et al (2002), page 319
3. Learning
• We observe substantial learning in the guessing game
• Also in other games
• How can we explain learning? What information do subjects use to learn and update their strategies?
• Three main models:
1. Reinforcement learning (6.2)
2. Belief learning (6.3)
3. Experience‐Weighted Attraction (EWA) learning (6.6)
3. Learning
• Three main models:
1. Reinforcement learning
• Intuition: strategies that yield high payoffs are likely to be played more
• Information used: own choices and payoffs in the past
2. Belief learning
• Intuition: past play of others probably indicates their future play
• Information used: others‘ choices
3. Experience‐Weighted Attraction (EWA) learning
• Intuition: both own past choices and others‘ choices are important, as well as forgone payoffs
• Information used: combines both models above
3. Learning
• What learning models often have in common
1. Attractions: loosely speaking, how ‚important‘ a strategy is
• In the first period:
– All strategies have same value
– Or the empirical frequencies are used
• In later periods:
– Attractions are updated based on previous payoffs and choices
2. Transformation of attractions into probabilities:
– Using a simple ratio or a power function, for example
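For instance, the simple ratio rule maps attractions into choice probabilities like this (a minimal sketch):

```python
# Simple ratio rule: the probability of playing strategy j is its
# attraction divided by the sum of all attractions.
def ratio_probabilities(attractions):
    total = sum(attractions)
    return [a / total for a in attractions]

# 11 equally attractive strategies -> each chosen with prob. 1/11 ≈ 9%
probs = ratio_probabilities([10] * 11)
print(round(probs[0], 3))  # 0.091
```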
3. Reinforcement Learning
• Main idea: by repeating behavior, strategies that are ‚successful‘ are reinforced more strongly than those that are not
• Simplest version (based on Roth and Erev, 1995)
– Suppose player i chooses strategy j in period t
• This strategy yields payoff π_i(s_i^j, s_−i(t))
• The attraction in period t is then:
A_i^j(t) = A_i^j(t−1) + π_i(s_i^j, s_−i(t))
– Other strategies are ‚not reinforced‘:
A_i^k(t) = A_i^k(t−1) for all k ≠ j
• Roth and Erev also allow for a local experimentation parameter
– This implies that strategies in the neighborhood of j also experience some reinforcement
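The basic update (without the local experimentation parameter) can be sketched as:

```python
# Simplest Roth-Erev reinforcement: only the strategy actually
# played has its attraction increased, by the realised payoff.
def reinforce(attractions, chosen, payoff):
    updated = list(attractions)
    updated[chosen] += payoff  # A_j(t) = A_j(t-1) + payoff for the chosen j
    return updated             # all other attractions are unchanged

# strategy index 3 is played and earns a payoff of 20:
A = reinforce([10] * 11, chosen=3, payoff=20)
print(A[3], A[0])  # 30 10
```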
3. Reinforcement Learning
Example
• Suppose we take the guessing game with p=2/3 and guesses [0,10,20,…,100]
Period 1
Strategy   A(1)   p(1)
0 10 9%
10 10 9%
20 10 9%
30 10 9%
40 10 9%
50 10 9%
60 10 9%
70 10 9%
80 10 9%
90 10 9%
100 10 9%
3. Reinforcement Learning
Example
• Suppose we take the guessing game with p=2/3 and guesses [0,10,20,…,100]
           Period 1       Case 1 for Period 2
Strategy   A(1)   p(1)    Results   A(2)   p(2)
0 10 9% 10 9%
10 10 9% 10 9%
20 10 9% 10 9%
30 10 9% W 10 9%
40 10 9% 10 9%
50 10 9% C 10 9%
60 10 9% 10 9%
70 10 9% 10 9%
80 10 9% 10 9%
90 10 9% 10 9%
100 10 9% 10 9%
3. Reinforcement Learning
Example
• Suppose we take the guessing game with p=2/3 and guesses [0,10,20,…,100]
           Period 1       Case 1 for Period 2        Case 2 for Period 2
Strategy   A(1)   p(1)    Results   A(2)   p(2)      Results   A(2)   p(2)
0 10 9% 10 9% 10 8%
10 10 9% 10 9% 10 8%
20 10 9% 10 9% 10 8%
30 10 9% W 10 9% W & C 30 23%
40 10 9% 10 9% 10 8%
50 10 9% C 10 9% 10 8%
60 10 9% 10 9% 10 8%
70 10 9% 10 9% 10 8%
80 10 9% 10 9% 10 8%
90 10 9% 10 9% 10 8%
100 10 9% 10 9% 10 8%
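Case 2 in the tables above can be verified numerically: the chosen and winning strategy 30 is reinforced by the prize of 20, and the ratio rule then gives 30/130 ≈ 23% against 10/130 ≈ 8% for every other strategy (a sketch reproducing the slide's numbers):

```python
# Case 2: the player chose 30 (C) and it won (W), earning 20.
A2 = [10] * 11   # attractions after period 1, one per guess 0, 10, ..., 100
A2[3] += 20      # strategy "30" is reinforced by the prize
probs = [a / sum(A2) for a in A2]
print(round(probs[3], 2), round(probs[0], 2))  # 0.23 0.08
```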
3. Reinforcement Learning
• Camerer (2003, p.320)
3. Belief Learning
• Main idea: players learn by best responding in period t to what others did in the past
• Beliefs in period t about what others will play are formed from past behavior:

B^j(t) = [ I(j, t−1) + Σ_{τ=1}^{t−2} γ^τ · I(j, t−1−τ) ] / [ 1 + Σ_{τ=1}^{t−2} γ^τ ]

where I(j, s) indicates whether others played strategy j in period s, and γ is the weight placed on older observations
• If γ = 0, individuals use Cournot best‐response dynamics
• If γ = 1, individuals use all previous periods, as in original fictitious play
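Under this decay form, the normalised weight on each past period can be computed as follows (a sketch; γ appears as `gamma`):

```python
# Weighted fictitious play: the belief weight on the observation
# from s periods ago is gamma**(s-1), normalised to sum to one.
def belief_weights(t, gamma):
    raw = [gamma ** s for s in range(t - 1)]  # weights on t-1, t-2, ...
    total = sum(raw)
    return [w / total for w in raw]

print(belief_weights(4, gamma=0))  # Cournot: only the last period counts
print(belief_weights(4, gamma=1))  # fictitious play: all periods equal
```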
3. Belief Learning
Example
• Suppose we take the guessing game with p=2/3 and guesses [0,10,20,…,100]
• 10 players
Period 1
Strategy   Choices    Results
0          1 x 0
10         1 x 10
20         1 x 20     W (winning nr.: 22.67)
30         2 x 30
40         1 x 40
50         3 x 50     C
60         1 x 60
70         0
80         0
90         0
100        0

(The mean of the ten choices is 34, so the winning number is ⅔·34 ≈ 22.67; the guess 20 is closest and wins, while the player we follow chose 50.)
3. Belief Learning
Example
• Suppose we take the guessing game with p=2/3 and guesses [0,10,20,…,100]
• 10 players
           Period 1              Period 2       Period 3
Strategy   Choices    Results    BR   Results   BR   Results
0          1 x 0
10         1 x 10                     W         10    C & W
20         1 x 20     W          20   C
30         2 x 30
40         1 x 40
50         3 x 50     C (winning nr.: 22.67)
60         1 x 60
70         0
80         0
90         0
100        0
3. Belief Learning
• Camerer (2003, p.319)
3. EWA Learning
• Hybrid model: incorporates two elements into attractions
– Experience, as in reinforcement models
– A weighted payoff term
• Updating occurs for strategies that are chosen
• …and also those that are not!
• Notation:
– N(t) = φ(1−κ)·N(t−1) + 1, with N(0) ∈ [1, 1/(1−φ(1−κ))]
• Attractions:
A_i^j(t) = [ φ·N(t−1)·A_i^j(t−1) + (δ + (1−δ)·I(s_i^j, s_i(t))) · π_i(s_i^j, s_−i(t)) ] / N(t)
3. EWA Learning
A_i^j(t) = [ φ·N(t−1)·A_i^j(t−1) + (δ + (1−δ)·I(s_i^j, s_i(t))) · π_i(s_i^j, s_−i(t)) ] / N(t)

• Both reinforcement and belief learning are nested in this model:
– Reinforcement, if δ=0 and N(0)=1
– Belief learning, if δ=1 and κ=0
• The term φ·N(t−1)·A_i^j(t−1) captures past experience; δ weights the payoff term:
1) If δ=0, only payoffs of strategies actually played count
2) If δ=1, both received and forgone payoffs count
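A single EWA update for one strategy, following the attraction rule above (parameter values are purely illustrative):

```python
# One EWA step for strategy j: N(t) = phi*(1-kappa)*N(t-1) + 1 and
# A_j(t) = [phi*N(t-1)*A_j(t-1) + (delta + (1-delta)*played) * payoff] / N(t)
def ewa_update(A_prev, N_prev, payoff, played,
               phi=0.9, delta=0.5, kappa=0.0):
    N = phi * (1 - kappa) * N_prev + 1
    weight = delta + (1 - delta) * (1 if played else 0)
    A = (phi * N_prev * A_prev + weight * payoff) / N
    return A, N

# with delta = 0, a forgone payoff does not enter the attraction at all:
A, N = ewa_update(A_prev=10.0, N_prev=1.0, payoff=20.0, played=False, delta=0.0)
print(round(A, 2), round(N, 2))  # 4.74 1.9
```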
3. EWA Learning
• Camerer (2003, p.319)
3. Learning
• Comparing models in different settings:
– Reinforcement models
• Tend to predict a slower pace of learning than actually observed (Roth and Erev, 1995)
• Perform well in 2x2 games with mixed‐strategy equilibria (Roth et al. 1999)
• Fail in market games or dominance solvable games
– Belief learning models
• Perform well in market games and dominance solvable games
– EWA combines both and often fits better
4. Conclusion
• Beauty‐contest games:
– Reveal that individuals often make choices far away from the Nash Equilibrium, which points to:
1. Limited steps of reasoning
2. Lack of common knowledge of rationality
• Behavior is consistent with a model of reasoning by steps:
• Step‐0, step‐1, step‐2
• Over time, choices in beauty‐contest game move towards the Nash Equilibrium – learning is important
– But how do people learn?
• Reinforcement models
• Belief learning models
• EWA