Max Planck Institute for Dynamics and Self-Organization -- Department for Nonlinear Dynamics and Network Dynamics Group

BCCN/BFNT AG-Seminar

Tuesday, 05.02.2013 17 c.t.

Is there "value" in reinforcement learning?

by Dr. Yonatan Loewenstein
from the Department of Neurobiology, The Hebrew University, Jerusalem, Israel

Contact person: Fred Wolf

Location

Ludwig Prandtl lecture hall

Abstract

Behaviors that are followed by a reward are more likely to be repeated in the future, a phenomenon known as the “law of effect”. I will discuss two quantitative computational accounts of this law of behavior. The first assumes that the agent maintains a set of estimates of the expected accumulated future rewards associated with the different states of the world, or with the different state-action pairs, and that the decision at a given state depends on these values. Learning in this framework results from an online update of the values according to the actions taken and their consequences. I will show that in a repeated-choice setting, this framework provides a good quantitative description of human behavior if we assume that the first experience resets the initial value estimates of the actions. In particular, I will focus on primacy and risk aversion. The second account of the law of effect posits that covariance-based synaptic plasticity underlies operant learning. I will show that in a free-operant setting, this covariance-based synaptic plasticity provides a good quantitative description of animal behavior and is consistent with the fast adaptation to matching behavior. I will conclude by contrasting the two approaches.
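
For orientation, a minimal sketch of the two accounts follows. It assumes a standard delta-rule value update with softmax choice for the first account and a reward-activity covariance update for the second; the parameters (alpha, beta, eta, tau) and the toy reward structure are illustrative assumptions, not details of the talk.

import numpy as np

rng = np.random.default_rng(0)

# Account 1: value learning with an online delta-rule update (illustrative).
# Q[i] is the running estimate of the expected reward of action i.
Q = np.zeros(2)
alpha, beta = 0.1, 3.0            # learning rate and softmax inverse temperature (assumed)
reward_prob = [0.3, 0.7]          # toy two-armed environment, not from the talk

for t in range(1000):
    p = np.exp(beta * Q) / np.exp(beta * Q).sum()   # choice probabilities from current values
    a = rng.choice(2, p=p)
    r = float(rng.random() < reward_prob[a])
    Q[a] += alpha * (r - Q[a])                      # online update of the chosen action's value

print("learned value estimates:", Q)

# Account 2: covariance-based synaptic plasticity (schematic form).
# The weight change is proportional to the covariance between reward and
# neural activity, estimated here with running means.
w, r_bar, x_bar = 0.0, 0.0, 0.0
eta, tau = 0.05, 0.1              # plasticity rate and averaging rate (assumed)

for t in range(1000):
    x = rng.random()              # stand-in for neural activity
    r = 1.0 if x > 0.5 else 0.0   # toy reward correlated with activity
    w += eta * (r - r_bar) * (x - x_bar)            # covariance-based update
    r_bar += tau * (r - r_bar)
    x_bar += tau * (x - x_bar)

print("covariance-driven weight:", w)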
