Better Than Thought: Learning, Dopamine and Neuroplasticity
The dopamine system is associated with various disorders. For example, if a part of the brain called the striatum lacks dopamine, a person shows the symptoms of Parkinson’s disease: tremor while at rest, muscle and limb rigidity and difficulty with initiating movement (akinesia). By contrast, too much dopamine causes hyperkinesias.
Furthermore, in schizophrenia a lack of dopamine in the frontal brain is associated with “negative” symptoms, i.e. the absence of normal social or interpersonal behaviours, such as a low level of emotional arousal (flat affect). By contrast, too much dopamine in schizophrenia causes “positive” symptoms, i.e. the presence of abnormal behaviours, such as delusions, hallucinations or disordered thoughts.
Dopamine also plays an important role in the aetiopathogenesis of another psychiatric disorder: addiction. Drugs that are used in an addictive way, as well as addictive behaviours, cause an increase in frontal dopamine levels. For example, while it is known that drugs like amphetamine and cocaine cause a massive increase in the secretion of frontal dopamine, so too can alcohol, nicotine, or the win from a “one-armed bandit”. In addition, animal studies have shown that behaviours associated with reward can be impaired through lesions of the frontal dopamine system or through its pharmacological blockade. These findings indicate that dopamine plays an important role in the reward system of the brain.
What is this good for? It is easier to ask this question than to answer it, because the answer presupposes an understanding of a few principles of associative learning. First, the principle of operant conditioning: if a stimulus is paired with a reward or punishment, the organism learns to couple this stimulus with the reward or punishment and can adapt its behaviour according to the predictive reward value of the stimulus. Stimuli with negative consequences are avoided, while stimuli with positive consequences are approached. Further investigations have shown that the brain systems for reward and punishment are distinct, and that dopamine is only involved in the reward system. Furthermore, it has been shown that, for optimal learning, it is not the absolute value of a reward that matters, but its unexpectedness: whenever the organism has a certain expectation and the result of the behaviour is better than expected, the organism learns. The same applies to the behaviour of dopaminergic neurons: they do not fire in response to a certain reward as such, but in response to the difference between predicted and actual reward. More precisely: dopamine is only secreted in the frontal brain when the consequence is better than expected.
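The idea that learning is driven by the difference between predicted and actual reward can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the update rule is a simple textbook-style error-correction rule that the text does not name, and the function name, the learning rate of 0.3 and the reward value of 1.0 are hypothetical choices rather than values from any study.

```python
def learn_single_stimulus(reward=1.0, learning_rate=0.3, trials=10):
    """Return the prediction error on each trial for one repeatedly rewarded stimulus.

    Assumed rule (not from the text): the expectation moves a fixed
    fraction of the way towards the reward actually received.
    """
    expectation = 0.0  # at first the organism expects nothing
    errors = []
    for _ in range(trials):
        error = reward - expectation          # actual minus predicted reward
        errors.append(error)
        expectation += learning_rate * error  # learning is proportional to the error
    return errors


errors = learn_single_stimulus()
for trial, error in enumerate(errors, start=1):
    print(f"trial {trial:2d}: prediction error = {error:.3f}")
```

In this sketch the error is largest on the first trial, when the reward is completely unexpected, and shrinks towards zero as the reward becomes fully predicted, which is when learning stops.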
A study by Pascale Waelti and colleagues impressively shows the connection between frontal dopamine activity and learning [1]. According to classical learning theory, a stimulus is only learned when it is connected with a reward. For example, if stimulus A is connected to a reward consisting of juice, the experimental animal learns to associate stimulus A with the reward juice and consequently shows an increase in licking behaviour. In contrast, the absence of a reward following stimulus B does not increase licking behaviour (Fig. 1).
When two further stimuli, X and Y, are now presented simultaneously with either stimulus A or B, i.e. the compounds AX or BY are presented, and both compounds are followed by a reward, one would expect that both stimuli X and Y become associated with the reward in a second learning step. Yet this is not the case: if the animals are shown stimulus Y alone, licking behaviour follows, in expectation of the reward. However, if the animals are shown stimulus X alone, this behaviour does not follow. Evidently, the previous learning of the association of stimulus A with a reward blocked the acquisition of the association of stimulus X with the reward juice. These observations gave rise to the name of this experimental procedure: the blocking paradigm. L. Kamin introduced this paradigm to learning theory as early as the 1960s in order to characterise the role of the predictive value of a stimulus for learning.
A → juice
B → no juice
A → licking behaviour
B → no licking behaviour
AX → juice
BY → juice
X → no licking behaviour (blocking)
Y → licking behaviour
Fig. 1 Learning leads to certain behaviours. Stimulus A is associated with licking behaviour, but not stimulus B. If, in a second step, the association of two further stimuli is learned (A with X and B with Y), and AX and BY are both rewarded with juice, the classical learning theory proposes the learning of the association of both X and Y with juice. This however is not the case. The predictive values of the stimuli provide an explanation of this phenomenon: The stimulus X has no predictive value when it was always coupled with the stimulus A, which predicts the reward juice. The previous learning of the association of stimulus A with the reward blocks the learning of the association of X with the reward.
In order to correlate the behaviour of learning organisms with the activity of dopaminergic neurons, Waelti and colleagues used the blocking paradigm in combination with recordings of the activity of dopaminergic neurons. In a behavioural experiment, monkeys first learned to associate a stimulus (A) with a reward (juice), while a second control stimulus (B) was not followed by reward juice. The amount of licking behaviour the monkey exhibited on the source of the juice after the presentation of the visual stimulus indicated the extent of learning of the visual stimulus. After the monkey had learned both stimuli A and B, two further stimuli (X and Y) were presented together with the stimuli A and B. Thus the animal perceived the stimuli A and X together, as well as the stimuli B and Y. In the runs, during which A and X were shown and a reward followed, the stimulus X had no predictive value, because the reward was already indicated by stimulus A and A was therefore associated with the reward. In contrast, the BY runs, which were followed by a reward, triggered a different learning: the animals learned to associate this newly composed stimulus with the reward. Thus the animals always continued their licking behaviour during the AX runs, whereas this licking behaviour was learned during the BY runs.
Now, the decisive test was to present stimuli X and Y alone. It could be shown that presentation of stimulus X alone was not followed by licking behaviour, i.e. stimulus X did not predict a reward with juice for the animal, even though it had been reliably paired with the reward during the presentation of the AX runs. In contrast, the animals showed licking behaviour after the presentation of the visual stimulus Y alone, i.e. the association of the stimulus Y with the juice (as BY) had evidently been learned. Thus it was proven that it is not the connection of a stimulus with reward alone that leads to learning. If this were the case, the connection of both stimulus X and stimulus Y with the reward would have been learned. However it is the predictive value of a stimulus that is responsible for its learning: the stimulus X had no predictive value for the reward and hence was not learned. In contrast, the stimulus Y had a predictive value for the reward and thus was learned.
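The blocking result can be reproduced with a small simulation. The sketch below assumes a Rescorla-Wagner-style rule in which all stimuli shown together share one prediction error. That rule is a standard model from formal learning theory, not something described in the text, and the learning rate, trial counts and the use of separate training blocks (rather than interleaved runs) are simplifying assumptions.

```python
def train(values, stimuli, reward, learning_rate=0.2, trials=50):
    """Update the associative strength of every stimulus in a compound.

    Assumed rule: the prediction for a compound is the sum of the
    strengths of its stimuli, and each stimulus present is updated
    by the same shared prediction error.
    """
    for _ in range(trials):
        expected = sum(values[s] for s in stimuli)  # summed prediction
        error = reward - expected                   # shared prediction error
        for s in stimuli:
            values[s] += learning_rate * error
    return values


def run_blocking():
    values = {"A": 0.0, "B": 0.0, "X": 0.0, "Y": 0.0}
    # Step 1: A is followed by juice, B is not.
    train(values, ["A"], reward=1.0)
    train(values, ["B"], reward=0.0)
    # Step 2: both compounds AX and BY are followed by juice.
    train(values, ["A", "X"], reward=1.0)
    train(values, ["B", "Y"], reward=1.0)
    return values


values = run_blocking()
for stimulus, strength in values.items():
    print(f"{stimulus}: associative strength = {strength:.3f}")
```

Under these assumptions X ends with a strength close to zero, because A already predicts the juice and leaves no error to learn from, whereas Y acquires a clear positive strength, because B predicted nothing and the juice in the BY runs was unexpected.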
In order to link this behaviour with neuronal activity, the authors recorded the activity of dopaminergic neurons in parts of the reward system of the brain, i.e. in the substantia nigra and ventral tegmental area, in trained animals. They found that 200 out of 285 dopaminergic neurons were activated by stimulus A and 150 of these neurons were able to distinguish between stimulus A and stimulus B. The rewarded compounds AX and BY activated 94 of 137 dopaminergic neurons, with none of them being activated in only one trial type. When the monkey was shown stimulus X alone, not a single dopaminergic neuron showed an exclusive reaction to stimulus X. In contrast, 39 out of 137 dopaminergic neurons were active when stimulus Y was presented alone. Altogether, the dopaminergic neurons clearly showed a stronger reaction to stimulus Y, which predicted the reward, than to stimulus X, which had no predictive value. Thus it could be shown that, on the behavioural level as well as on the level of the activity of dopaminergic neurons, it is the predictive value of a stimulus for reward that is decisive for learning, and not the mere pairing of a stimulus with a reward.
What is the exact relationship between learning and the activation of the dopamine system? After the connection of stimulus A with a reward and of stimulus B with the absence of a reward had been learned, the activity of the dopamine system did not increase after the reward in the A runs or at the point of no reward in the B runs. However, if a reward followed unexpectedly after the presentation of stimulus B, the dopaminergic neurons fired. If, unusually, stimulus A was not followed by a reward, the activity of the dopaminergic neurons dropped below their basal activity. Thus the relationship of a reward and the expectation of a reward with dopaminergic neuron activity is clearly shown: when the reward is better than expected, the dopaminergic neurons fire; when the reward is lower than expected, their activity decreases.
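The four cases described above can be summarised as a sign rule on the difference between actual and expected reward. The following Python sketch is purely illustrative: the function name, the labels and the coding of "juice expected" as 1.0 and "no juice expected" as 0.0 are hypothetical conventions, not measurements from the study.

```python
def dopamine_response(expected_reward, actual_reward):
    """Classify the response at the time of the outcome by the sign of the error."""
    error = actual_reward - expected_reward
    if error > 0:
        return "burst"     # better than expected: neurons fire
    if error < 0:
        return "dip"       # worse than expected: activity falls below baseline
    return "baseline"      # exactly as expected: no change


# (expected reward, actual reward) after A and B have been learned
cases = {
    "A followed by juice": (1.0, 1.0),
    "B followed by no juice": (0.0, 0.0),
    "B followed by unexpected juice": (0.0, 1.0),
    "A followed by no juice": (1.0, 0.0),
}

for label, (expected, actual) in cases.items():
    print(f"{label}: {dopamine_response(expected, actual)}")
```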
In addition, the reaction of the neurons to the rewarded AX runs was low, since stimulus A already indicated the reward. By contrast, the reaction of the dopaminergic neurons to the rewarded BY runs was initially high, since stimulus Y was new and stimulus B had not yet been coupled with the reward.
The connection of the activity of the dopaminergic neurons with the predictive value of the stimuli could also be shown in runs during which the stimuli X and Y were presented alone without reward: after the presentation of the stimulus X (of which no association with the reward was learned) only one out of 85 dopaminergic neurons showed a decrease in activity, while 30 dopaminergic neurons showed a decrease in activity when stimulus Y was not followed by a reward. On the other hand, the activity of dopaminergic neurons increased when stimulus X was presented alone and a reward followed, while it decreased when stimulus Y was presented without reward.
These findings are also relevant to changes in cortical maps, as an investigation by Bao and colleagues was able to show. The authors found that direct electrical stimulation of dopaminergic nuclei in rats changes the cortical representations of auditory stimuli that are to be learned. In other words: the cortex was plastic for the corresponding acoustic stimulation only when the dopamine system was simultaneously activated.
[1] Waelti P., A. Dickinson and W. Schultz (2001), “Dopamine Responses Comply with Basic Assumptions of Formal Learning Theory”, Nature, Vol. 412, No. 6842, 5 July, pp. 43-48.
[2] Bao S., V.T. Chan and M.M. Merzenich (2001), “Cortical Remodelling Induced by Activity of Ventral Tegmental Dopamine Neurons”, Nature, Vol. 412, No. 6842, 5 July, pp. 79-83.
(Translation of Spitzer, Manfred (2002), "Besser als Gedacht: Lernen, Dopamin und Neuroplastizität", in Schokolade im Gehirn: und Weitere Geschichten aus der Nervenheilkunde, Schattauer GmbH, Stuttgart, pp. 49-54)