Max Planck Institute for Dynamics and Self-Organization -- Department for Nonlinear Dynamics and Network Dynamics Group

BCCN AG-Seminar

Tuesday, 13.11.2007 17 c.t.

Learning to decide: the role of dopamine in planning and action

by Dr. Genela Morris
from Charité Berlin

Location

Seminar room, Haus 2, 4th floor (Bunsenstr.)

Abstract

The basal ganglia are commonly viewed as a set of interconnected structures dealing with action selection. In particular, it has been postulated that these structures are responsible for learning tasks that result in habitually performed action sequences, and that this learning is achieved by reinforcement learning algorithms. A popular model describes this network as an actor-critic architecture, in which the main axis serves as the actor and the neuromodulators dopamine and acetylcholine, acting on the actor, serve as the critic. We recorded from midbrain dopamine neurons and from cholinergic interneurons, both acting in the striatum, in monkeys performing a probabilistic instrumental conditioning task. We show that while the responses of dopamine neurons are consistent with the error signal of the temporal difference (TD) learning algorithm, those of cholinergic neurons are not; the latter are much more suitable for providing a timing signal for the learning window. We further explored the activity of dopamine neurons in a two-armed-bandit decision setting. These experiments revealed that the animals' behavioral policy was better predicted by the activity of the dopamine neurons than by the received rewards, indicating that learning is indeed achieved through plastic changes induced by dopaminergic activity. Before each choice, the activity of dopamine neurons was indicative of the upcoming action as early as 120 ms after presentation of the alternatives. This suggests that dopamine neurons do not participate in the actual decision making; rather, the decision process is performed elsewhere and conveyed to these neurons. Computationally, this means that an actor-critic model is not the most appropriate reinforcement learning framework for describing learning in the striatum. Instead, methods that compute state-action values rather than state values are employed. Specifically, we show that the activity of the dopamine neurons is consistent with the SARSA learning algorithm.
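The computational distinction the abstract draws can be made concrete with a minimal sketch. The following Python fragment is purely illustrative (the function names, the dictionary-based value tables, and the toy state/action labels are assumptions, not the experimental model): it contrasts the state-value TD error an actor-critic critic would compute with the state-action error of SARSA, which additionally depends on the action actually chosen next.

```python
def td_error(V, s, r, s_next, gamma=0.9):
    """Actor-critic critic signal: delta = r + gamma * V(s') - V(s).

    Depends only on state values, not on which action will be taken next.
    """
    return r + gamma * V[s_next] - V[s]


def sarsa_error(Q, s, a, r, s_next, a_next, gamma=0.9):
    """SARSA signal: delta = r + gamma * Q(s', a') - Q(s, a).

    Depends on the upcoming action a' -- which is why a dopamine response
    that already reflects the future choice fits SARSA-style learning
    better than a pure state-value critic.
    """
    return r + gamma * Q[(s_next, a_next)] - Q[(s, a)]


# Toy two-state example (illustrative values only).
V = {"cue": 0.2, "end": 0.0}
Q = {("cue", "left"): 0.5, ("cue", "right"): 0.1, ("end", "stop"): 0.0}

delta_v = td_error(V, "cue", 1.0, "end")                      # 1.0 + 0 - 0.2
delta_q = sarsa_error(Q, "cue", "left", 1.0, "end", "stop")   # 1.0 + 0 - 0.5
```

Note how `sarsa_error` changes if `"right"` rather than `"left"` is the chosen action, while `td_error` is unchanged: this action-dependence is the signature the abstract attributes to the recorded dopamine responses.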
