
III.2 Implications of the results for existing research

III.2.1 Implications for the Dual-Process Account

As outlined before, the results of several studies suggest that under certain circumstances, controlled feedback learning processes, and not automatic reinforcement learning processes, are central for adaptive behavior (Chase et al., 2011), or that they can bias automatic processes (Doll et al., 2011; Doll et al., 2009). The present findings contribute to this literature as they provide evidence that the feedback-P300 – which appears to be strongly related to attention and working memory (Donchin & Coles, 1988; Johnson Jr., 1988a; Polich, 2007) – was predictive of successful learning from errors (Study 1) and was affected by irrelevant invalid information that was also detrimental to test performance (Study 2). In contrast, we did not find any correlation between test performance and the FRN amplitude or the FRN effect magnitude. Furthermore, there was no reliable FRN effect in Study 2. These results appear to contradict several findings that indicate a connection between the FRN effect and adaptive behavior (Butterfield & Mangels, 2003; Frank et al., 2005; Hewig et al., 2010; Holroyd & Krigolson, 2007; Santesso et al., 2009; Van Der Helden et al., 2010; but see P. Li et al., 2010; Mies, van der Veen, Tulen, Birkenhäger, et al., 2011; Philiastides et al., 2010).

The most likely explanation for the differing findings can be derived from the fact that most of these studies focused on reinforcement learning and used paradigms in which underlying rules were too complicated to be represented in working memory (e.g., a sequence in a sequence learning task; Van Der Helden et al., 2010), contingencies could only be extracted over a considerable number of trials (Santesso et al., 2009), or explicit memory of immediately preceding outcomes (i.e., working memory content) could not be used reliably for adaptive behavior (as in a time estimation task, a probabilistic learning task, or while playing Black Jack). In contrast, and as intended, performance in the multiple-choice learning paradigm relied strongly on the explicit encoding of feedback and its later recall. This conclusion is supported not only by prior research (Bangert-Drowns et al., 1991; Butterfield & Mangels, 2003; Mangels et al., 2012) but also by the obtained ERP results. These suggest that effective feedback evaluation is most likely impaired by factors that negatively affect attention and working memory updating during feedback presentation, as indicated by an attenuated feedback-P300 amplitude: In Study 2, the presence of irrelevant invalid feedback information was associated with a reduced feedback-P300 and impaired test performance. In Study 1, a more pronounced feedback-P300 was associated with successful learning from errors, indicating that controlled feedback evaluation might have been impaired in trials that did not result in successful error correction.

While this underlines that controlled processes can be of central importance for learning and decision-making, the results of Studies 2 and 3 also provide some insight into the effect of top-down processes on automatic feedback processing. Although the irrelevant feedback stimulus contained potential feedback information, there was neither an FRN effect when it was presented prior to relevant feedback, nor did it affect the FRN (or N2) after relevant feedback. It follows that the reinforcement learning processes underlying the FRN effect are not entirely ‘automatic’ but subject to prior information about the validity of an apparent feedback stimulus. Study 3 showed that prior feedback reliability information does indeed moderate the FRN effect; it was only observed when feedback was indicated to be more likely to be valid. Further analysis showed that this discriminating effect did not emerge or increase over the course of the experimental session, but was already clearly present in the first half of the session. This suggests that top-down processes, and not low-level learning, caused this finding.

Here, a parallel can be drawn to the augmented BG-DA model (Frank & Claus, 2006) and associated research. Frank and Claus (2006) suggested that working memory-based processes can bias feedback processing in the basal ganglia, and recent findings indicate that an explicit, albeit erroneous, rule can influence choice behavior in a reinforcement learning task (e.g., Baron, Kaufman, & Stauber, 1969; Doll et al., 2009; Kaufman, Baron, & Kopp, 1966; Nosofsky, Clark, & Shin, 1989), and even more so in people with better working memory performance (Doll et al., 2011). Doll et al. (2011) argued that this is due to a top-down bias on the reinforcement learning system. Indeed, using fMRI, Li and colleagues (2011) reported that feedback-related activity in the nucleus accumbens and the ventromedial prefrontal cortex decreased when an instruction was given. However, ERP results in a study by Walsh and Anderson (2011) revealed that only the feedback-P300, but not the FRN, was affected by an instruction (see also Sailer et al., 2010). Although the response behavior suggested by the instruction was immediately adopted, reinforcement learning continued unabated, as indicated by a significant FRN effect. Moreover, this FRN effect also corresponded to the reward prediction error on a trial-by-trial basis. It appears that although participants considered feedback to be less important in the instruction condition, as indicated by the reduced feedback-P300, the reinforcement learning system still processed the feedback stimulus normally.
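For reference, the trial-by-trial reward prediction error invoked in this account is commonly formalized with a simple delta rule (a generic sketch of the standard formalization, not the specific model fitted by Walsh and Anderson, 2011):

\delta_t = r_t - V_t, \qquad V_{t+1} = V_t + \alpha \, \delta_t

where r_t denotes the outcome received on trial t, V_t the expected outcome, and \alpha the learning rate. Under the reinforcement learning account of the FRN, more negative values of \delta_t (outcomes worse than expected) are assumed to go along with a larger FRN.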

At first glance, the results of Studies 2 and 3 are at odds with these findings. However, considering important differences between these studies can resolve this conflict. On the one hand, under the circumstances found in Walsh and Anderson’s (2011) study, it is adaptive that the reinforcement learning system continues to process feedback regardless of the presence of the ostensibly correct instruction in working memory. The information gained over time through automatic feedback processing can either support the assumptions held in working memory or call them into question, indicating either that they were wrong (Doll et al., 2009) or that contingencies have changed (Chase et al., 2011). In this way, dual processing allows for more adaptive decision making. On the other hand, while processing of this seemingly uninformative feedback is at least not harmful to performance, processing of invalid feedback is very likely to be detrimental, as suggested by the behavioral results of Studies 2 and 3. When automatic feedback processing is likely to add false and thus interfering information to feedback evaluation, it should be suppressed by top-down processes. Thus, together these studies suggest that prior information always affects controlled feedback processing (as indicated by the feedback-P300 results; see Sailer et al., 2010; Walsh & Anderson, 2011), whereas whether the reinforcement learning system is allowed to process feedback without top-down interference depends on whether available information indicates that uninfluenced reinforcement learning might be harmful. However, as suggested by Mies, Van der Veen, Tulen, Hengeveld, et al. (2011), this additional information can only affect the FRN when sufficient processing time is available between its presentation and the presentation of the feedback stimulus. Together, these studies draw a more detailed picture of the interaction between the two feedback processing systems.