Probability Matching and Reinforcement Learning
report
posted on 2011-04-20, 13:58, authored by Javier Rivas
Probability matching occurs when an action is chosen with a frequency equivalent
to the probability of that action being the best choice. This sub-optimal behavior has
been reported repeatedly by psychologists and experimental economists. We provide an
evolutionary foundation for this phenomenon by showing that learning by reinforcement
can lead to probability matching and that, if learning occurs sufficiently slowly, probability
matching occurs not only in choice frequencies but also in choice probabilities. We
complete these results by proving that there exists no quasi-linear reinforcement learning
specification such that behavior is optimal for all environments where counterfactuals are
observed.
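To make concrete why probability matching is sub-optimal, the following minimal sketch compares how often each strategy picks the best action in a two-action setting. It is an illustration only, not the model from the report: it assumes that action A is the best choice with a fixed probability `p` in every period, independently of the decision maker's own choice, and the function names are hypothetical.

```python
def matching_accuracy(p: float) -> float:
    """Chance of picking the best action under probability matching.

    Assumption: action A is best with probability p, the decision maker
    picks A with that same probability p, and the two draws are independent.
    """
    # Correct when both draws land on A, or both land on B.
    return p * p + (1 - p) * (1 - p)


def maximizing_accuracy(p: float) -> float:
    """Chance of picking the best action when always choosing the likelier one."""
    return max(p, 1 - p)


if __name__ == "__main__":
    for p in (0.5, 0.7, 0.9):
        print(
            f"p={p:.1f}  matching={matching_accuracy(p):.2f}  "
            f"maximizing={maximizing_accuracy(p):.2f}"
        )
```

Under these assumptions, with `p = 0.7` a probability matcher picks the best action 58% of the time, whereas always choosing the likelier action is right 70% of the time. The two strategies coincide only when `p` is 0.5, 0, or 1.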