
2. Study 1: Modeling behavioral measures of error detection in choice

2.3. Experiment

To test the derived predictions for the CM and RM accounts we conducted an experiment in which an Eriksen-flanker task corresponding to our model was used.

Participants had to classify a target letter by pressing one of three response buttons with the fingers of one hand. The target was flanked either by letters of a different category (incongruent condition) or by neutral symbols (neutral condition). In half of the blocks, the participants were instructed to give an ESR by pressing a neutral key with the hand not used for the main task (ESR condition), whereas in the other half, they were instructed to correct their errors (ECR condition)3.

The response criterion was varied by means of a deadline procedure in which the participants were instructed to give their first response before an acoustical signal sounded. The interval from stimulus to signal onset, i.e., the deadline, varied among three levels across blocks. In this way, the participants could anticipate the deadline and adjust their response criterion in an optimal way. An alternative procedure would have been to instruct the participants to emphasize either speed or accuracy. However, such a method might have induced further strategic differences. Yeung et al. (2004), for instance, speculated that emphasizing accuracy vs. speed could also lead to different degrees of attention.

3 Alternatively, we could also have used spontaneous error corrections as a measure of ECR performance (Fiehler et al., 2005). However, there is evidence that error corrections are actively suppressed if they are not instructed (Rabbitt & Rodgers, 1977a; Steinhauser & Hübner, 2006a). Accordingly, not all internal corrections would have led to an ECR. It is even possible that the number of inhibited ECRs depends on the deadline. Such inhibitory mechanisms should not be involved in error signaling, because it is reasonable to assume that the production of an ESR is not automatic. Thus, the use of non-instructed ECRs would have confounded the two measures with respect to the presence of inhibitory mechanisms.

Method

Participants. Ten participants (3 female, 7 male) between 19 and 30 years of age (mean 25.2) with normal or corrected-to-normal vision participated in the study. All were right-handed. Participants were recruited at the Universität Konstanz and were paid 5 Euro per hour.

Apparatus. The stimuli were presented on a 21-inch color monitor. An IBM-compatible PC controlled stimulus presentation and response registration.

Stimuli. Stimulus arrays were composed of a target letter and two identical distractors, one to the left and one to the right of the target. The letters K, V, M, P, R, and W and the neutral symbols %, &, #, and § were taken from an Arial font and scaled to a visual angle of 1.67° height and 1.51° width at a viewing distance of 127 cm. The whole array subtended a visual angle of 5.10° width. Two letters were assigned to each response. Each letter was used as a target and was combined with a distractor either from the set of the four letters that required a different response (incongruent stimulus) or from the set of neutral symbols (neutral stimulus). In this way, 48 stimuli were constructed.
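The construction of the stimulus set can be sketched as follows. This is only an illustration: the letter-to-finger mapping is the one given in the Procedure paragraph, while the names `RESPONSE_OF`, `NEUTRAL_SYMBOLS`, and `build_stimuli` are hypothetical and not taken from the original experiment software.

```python
# Letter-to-response mapping as described in the Procedure paragraph.
# The finger labels are illustrative names, not part of the original software.
RESPONSE_OF = {
    "K": "index", "V": "index",
    "M": "middle", "P": "middle",
    "R": "ring", "W": "ring",
}
NEUTRAL_SYMBOLS = ["%", "&", "#", "§"]


def build_stimuli():
    """Return (array, condition) pairs for every target/flanker combination."""
    stimuli = []
    for target, response in RESPONSE_OF.items():
        # Incongruent flankers: the four letters mapped to a different response.
        incongruent = [l for l, r in RESPONSE_OF.items() if r != response]
        for flanker in incongruent:
            stimuli.append((flanker + target + flanker, "incongruent"))
        for flanker in NEUTRAL_SYMBOLS:
            stimuli.append((flanker + target + flanker, "neutral"))
    return stimuli


STIMULI = build_stimuli()
print(len(STIMULI))  # 6 targets x (4 incongruent + 4 neutral flankers) = 48
```

Each of the six targets is paired with four incongruent and four neutral flankers, which yields the 48 stimuli, half of them incongruent.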

Procedure. Participants were told to respond to the identity of the target and to ignore the flankers. Responses were given with the fingers of the right hand. A key-press with the index finger was required if the target was K or V, a key-press with the middle finger if the target was M or P, and a key-press with the ring finger otherwise (R or W).

Each trial started with a stimulus array presented for 150 ms followed by a blank screen. After a specific interval, an acoustical deadline signal (800 Hz) sounded for 150 ms. Participants were instructed to respond before this signal. In half of the blocks, participants were also instructed to correct their errors by pressing the correct key immediately after they had detected the error (error correction response, ECR). In the other half of the blocks, they had to signal errors by pressing the space bar of a standard keyboard with their left hand immediately after each error (error signaling response, ESR). A new trial started 1500 ms after the first response. If further responses (ECR, ESR) occurred within this interval, a new interval of 1500 ms was started. No feedback on the accuracy of the response was provided. However, on some trials speed feedback was given. Whenever the response time exceeded the deadline on five consecutive trials, the German word “schneller” (English: faster) was presented on the screen for 200 ms, starting 200 ms after the response.
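The speed-feedback rule can be illustrated with a short sketch. The function name `speed_feedback_trials` is hypothetical, and the text does not say whether the count of late responses restarts after feedback has been shown; the sketch assumes that it does.

```python
def speed_feedback_trials(response_times_ms, deadline_ms, run_length=5):
    """Return the indices of trials after which 'schneller' would be shown."""
    feedback_at = []
    misses_in_a_row = 0
    for i, rt in enumerate(response_times_ms):
        if rt > deadline_ms:
            misses_in_a_row += 1
        else:
            misses_in_a_row = 0
        if misses_in_a_row == run_length:
            feedback_at.append(i)
            # Assumption (not stated in the text): the count restarts
            # once feedback has been given.
            misses_in_a_row = 0
    return feedback_at


# Four late responses, one in time, then five late responses in a row.
rts = [500, 500, 500, 500, 400, 500, 500, 500, 500, 500, 400]
print(speed_feedback_trials(rts, deadline_ms=450))  # [9]
```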

Each block consisted of 96 trials, two for each possible stimulus. Half of the stimuli were neutral, and the other half were incongruent. Participants worked through 24 test blocks distributed across two test sessions, totaling 2304 trials. The ESR and ECR instructions alternated between blocks. The type of the first instruction was counterbalanced across participants. Furthermore, there were three deadline conditions (low, intermediate, high), which were constant within each block but varied across blocks. The order of the deadline conditions was randomized, and four blocks of each deadline occurred in each session.
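One way to generate a block schedule that satisfies these constraints is sketched below. The names are hypothetical. The sketch also makes two assumptions that the text does not confirm: that both sessions start with the same instruction, and that deadlines are shuffled without balancing them against instruction type.

```python
import random

TRIALS_PER_BLOCK = 96
DEADLINES = ["low", "intermediate", "high"]


def session_schedule(first_instruction, rng):
    """Return 12 (instruction, deadline) pairs for one test session."""
    other = "ECR" if first_instruction == "ESR" else "ESR"
    deadlines = DEADLINES * 4  # four blocks of each deadline per session
    rng.shuffle(deadlines)     # randomized deadline order
    return [
        (first_instruction if i % 2 == 0 else other, deadline)
        for i, deadline in enumerate(deadlines)  # instructions alternate
    ]


rng = random.Random(1)  # fixed seed only to make the sketch reproducible
# Assumption: the second session starts with the same instruction as the first.
schedule = session_schedule("ESR", rng) + session_schedule("ESR", rng)
total_trials = len(schedule) * TRIALS_PER_BLOCK
print(len(schedule), total_trials)  # 24 2304
```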

Each session started with three practice blocks, followed by 12 test blocks. In a preliminary practice session, 12 practice blocks were performed. In the first four blocks of this practice session no deadline was applied. Rather, they served to determine the three deadlines for subsequent blocks. For each participant, the intermediate deadline was individually set to the median response time in the fourth practice block. The low and high deadlines were obtained by subtracting 50 ms from and adding 50 ms to the intermediate deadline, respectively. These deadlines were used throughout the entire experiment.
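The derivation of the individual deadlines amounts to a median plus and minus a fixed step, as the following sketch shows. The function name and the example response times are made up for illustration. Whether the median was taken over all responses or only over correct ones is not stated, so the sketch simply uses whatever list it is given.

```python
from statistics import median


def deadlines_from_practice(rts_ms, step_ms=50):
    """Derive the three deadlines from the RTs of the fourth practice block."""
    intermediate = median(rts_ms)
    return {
        "low": intermediate - step_ms,
        "intermediate": intermediate,
        "high": intermediate + step_ms,
    }


# Hypothetical response times (ms) from one participant's fourth practice block.
example = deadlines_from_practice([480, 510, 530, 495, 520])
print(example)  # {'low': 460, 'intermediate': 510, 'high': 560}
```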

Results

To control for outliers, trials whose first response time was more than two standard deviations above or below the mean were excluded (<1%). The remaining trials were classified with respect to whether the first response was correct or wrong and whether it was followed by an ECR or an ESR. Trials which included more than two responses were excluded, with one exception: Errors which were followed by an ECR as well as an ESR were assigned to a separate category. For an overview, Table 1 reports the relative frequencies of ECRs and ESRs within trials with correct and erroneous responses for our two main conditions.
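The outlier criterion can be sketched as follows. The text does not specify whether the mean and standard deviation were computed per participant, per condition, or over all trials, nor which standard deviation estimator was used. The sketch therefore works on a single list of response times and assumes the sample standard deviation. The function name is hypothetical.

```python
from statistics import mean, stdev


def exclude_outliers(rts_ms, n_sd=2.0):
    """Keep only RTs within n_sd standard deviations of the mean."""
    m = mean(rts_ms)
    s = stdev(rts_ms)  # assumption: sample SD; the text does not say which
    lower, upper = m - n_sd * s, m + n_sd * s
    return [rt for rt in rts_ms if lower <= rt <= upper]


# Twenty typical responses and one extremely slow response (hypothetical data).
rts = [500] * 20 + [2000]
kept = exclude_outliers(rts)
print(len(rts) - len(kept))  # number of excluded trials
```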

The table reveals a high number of spontaneous error corrections in the ESR condition (35%). Because we do not know how ESR latency is affected by a preceding ECR, the following analyses included only trials from the ESR condition in which no ECR was involved. However, separate analyses revealed that trials with spontaneous ECRs show a rather similar pattern, although the low absolute number of trials with both an ECR and an ESR made a stable estimation of latencies difficult.

In the following, we report analyses of those dependent variables which were used to test the predictions of the model. We start with reporting analyses for each dependent variable separately. In a final section, ECR and ESR performance is compared. Figure 5 depicts each dependent variable as a function of deadline level.

Table 1: Frequencies and latencies for all trial types observed in Experiment 1.

                      initial response correct                       initial response wrong
                      freq. (%)  RT of consecutive responses (ms)    freq. (%)  RT of consecutive responses (ms)
condition ECR
  no ECR              99.0       526                                 13.9       489
  ECR                 1.0        - / -                               85.2       492 / 412
  wrong ECR                                                          1.0        - / -
condition ESR
  no ECR, no ESR      98.3       532                                 12.6       -
  no ECR, ESR         0.9        - / -                               50.9       510 / 557
  ECR, no ESR         0.5        - / -                               13.5       530 / 168
  ECR, ESR            0.3        - / - / -                           21.5       502 / 317 / 844
  wrong ECR, no ESR                                                  0.3        - / -
  wrong ECR, ESR                                                     1.3        - / - / -

Note: Frequencies were computed relative to all trials where the initial response was either correct (left column) or an error (right column) within conditions where ECRs (error correction responses) or ESRs (error signaling responses), respectively, were required. For the latencies, the first value represents the latency of the initial response, the second value represents the latency of the ECR or the ESR (when no ECR occurred) computed as the difference between the initial response and the ECR/ESR, and the third value represents the latency of an ESR computed as the difference between the initial response and the ESR. ESRs followed by ECRs are not considered because they were virtually never observed. Minus signs indicate cells where no latency could be calculated because of too few trials or because some participants had empty cells. RT = response time. Freq. = frequency.

Figure 5: Effect of response criterion on the latency of the initial response on correct trials (A), the error rate (B), ECR latency (C), ECR hits and false alarms (D), ESR latency (E), and ESR hits and false alarms (F), in the empirical data, the RM model fit, and the CM model fit.

Whereas for the CM model, the independent and nested fits are shown separately, both fit types were identical for the RM model (for the nested CM model fit, only ESR data are shown, because initial response data and ECR data correspond to those of the RM fit). ECR = error correction response. ESR = error signaling response. RM = response monitoring. CM = conflict monitoring.

Initial responses. To check whether our manipulation of response criterion was successful, we analyzed the response times of correct responses and the overall error rate. Although we focused on the effect of the deadline, we also included the variable stimulus congruency in these analyses. By examining the influence of the flanker letter, we wanted to test whether our participants applied strategies other than a mere criterion shift to adapt to the different deadline levels. For instance, if a long deadline implies that more attention is directed to the target than with a short deadline, we should observe a decreased congruency effect in this condition.

To calculate the mean latency of correct responses, we averaged the latencies of initial responses from trials where the initial response was correct. The data were entered into a three-way ANOVA with repeated measurement on the variables Block Type (ECR-condition, ESR-condition), Deadline (1, 2, 3), and Congruency (neutral, incongruent). The analysis revealed significant main effects of all variables. Mean response time was increased in the ESR-condition (521 ms) relative to the ECR-condition (513 ms), F(1, 9) = 21.3, p < 0.01. It was increased with incongruent stimuli (523 ms) relative to neutral stimuli (510 ms), F(1, 9) = 60.9, p < 0.001. Finally, it increased linearly with an increasing deadline level (1: 487 ms, 2: 516 ms, and 3: 547 ms), F(2, 18) = 137.7, p < 0.001. No significant interaction was obtained.

The error rate denotes the relative frequency of erroneous initial responses. The data were subjected to the same type of analysis as above. The analysis indicated significant main effects of Deadline, F(2, 18) = 31.5, p < 0.001, and Congruency, F(1, 9) = 88.1, p < 0.001, which represent the fact that the error rate decreased linearly with an increasing deadline level (1: 27.6%, 2: 19.0%, 3: 13.6%) and was higher on incongruent trials (23.3%) than on neutral trials (16.8%). However, these effects were qualified by a significant three-way interaction involving all factors, F(2, 18) = 7.28, p < 0.01. This can be attributed to the fact that the performance difference between neutral and incongruent stimuli fluctuated in a non-systematic manner between the different deadline conditions of the ESR-blocks (1: 4.7%; 2: 9.3%; 3: 5.7%) and the ECR-blocks (1: 8.5%; 2: 5.4%; 3: 5.3%).

ECR and ESR performance. ECRs and ESRs were analyzed in a similar way. For each measure, three dependent variables were calculated and entered into a one-way ANOVA with repeated measurement on the variable Deadline (1, 2, 3). The latency was calculated as the time elapsed between the erroneous response and the respective detection response, including only trials where errors were successfully corrected or signaled, respectively. The hit rate was calculated as the proportion of successfully corrected or signaled errors among all trials where the initial response was an error. Finally, the false alarm rate was calculated as the proportion of erroneously corrected or signaled correct responses among all trials with a correct initial response.
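The three dependent variables can be expressed compactly in code. The sketch below uses a deliberately simplified, hypothetical trial record: `detection_latency_ms` holds the latency of a successful ECR or ESR and is `None` when no such response occurred, so wrong corrections are not modeled. It assumes that the trial list contains at least one error with a detection response and at least one correct trial.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class Trial:
    initial_correct: bool
    # Time from the initial response to the ECR/ESR; None if none was given.
    detection_latency_ms: Optional[float]


def detection_measures(trials):
    """Compute hit rate, false alarm rate, and mean latency of hits."""
    errors = [t for t in trials if not t.initial_correct]
    corrects = [t for t in trials if t.initial_correct]
    hits = [t for t in errors if t.detection_latency_ms is not None]
    false_alarms = [t for t in corrects if t.detection_latency_ms is not None]
    return {
        "hit_rate": len(hits) / len(errors),
        "false_alarm_rate": len(false_alarms) / len(corrects),
        "latency_ms": mean(t.detection_latency_ms for t in hits),
    }


# Hypothetical data: 4 errors (3 detected) and 10 correct trials (1 false alarm).
trials = (
    [Trial(False, 400), Trial(False, 420), Trial(False, 440), Trial(False, None)]
    + [Trial(True, None)] * 9
    + [Trial(True, 350)]
)
measures = detection_measures(trials)
print(measures)
```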

The analysis of the ECR latency revealed a significant effect of Deadline, F(2, 18) = 14.6, p < 0.001. The mean correction time increased with an increasing deadline level (1: 371 ms, 2: 415 ms, 3: 479 ms). The ECR hit rate showed a marginally significant effect of Deadline, F(2, 18) = 3.38, p = 0.057. The hit rate was similar on the first two deadline levels, but decreased on the third level (1: 86.8%, 2: 87.1%, 3: 82.6%). Finally, the analysis of the ECR false alarm rate revealed a non-significant trend toward an increased false alarm rate on the lowest deadline level (1: 2.1%, 2: 0.8%, 3: 0.4%).

The analysis of the ESR latency showed that Deadline had a significant effect on this variable, F(2, 18) = 10.2, p < 0.001. The detection time increased linearly with an increasing deadline level (1: 513 ms, 2: 576 ms, 3: 641 ms). The analysis of the ESR hit rate showed no significant effect of Deadline. However, there was a non-significant trend toward a reduction of the detection hit rate on higher deadline levels (1: 53.9%, 2: 48.2%, 3: 46.9%). Finally, the analysis of the ESR false alarm rate revealed a significant effect of Deadline, F(2, 18) = 3.78, p < 0.05. False alarms were more frequent on lower deadline levels (1: 1.5%, 2: 0.9%, 3: 0.4%).

Comparison of ECR and ESR. To compare ECRs and ESRs, both were entered into the same analyses. We computed two-way ANOVAs with repeated measurement on the variables Deadline (1, 2, 3) and Measure (ESR, ECR). Only values involving the variable Measure are reported. For the latencies, the main effects of Deadline, F(2, 18) = 26.2, p < 0.001, and Measure, F(1, 9) = 76.2, p < 0.001, were significant. ESR latency (577 ms) was increased relative to ECR latency (422 ms). The interaction was not significant. For the hit rates, only the main effect of Measure reached significance, F(1, 9) = 118.2, p < 0.001. The hit rate for ESRs (49.7%) was reduced as compared to that for ECRs (85.5%). Again, no significant interaction was detected. For the false alarm rates, no significant effect was obtained.

Discussion

In the present experiment the response criterion was manipulated by varying a response deadline between blocks. We hoped that the participants used an individual criterion for each deadline. The analyses of the latencies and error rates for the initial responses confirmed that the manipulation was successful. Response times of correct responses increased linearly with the deadline, whereas the error rates decreased. Moreover, there is no evidence that our participants adopted different attentional strategies for the different deadline levels. This can be concluded because the distractors had a similar effect for the different deadline levels, at least for the response times. Taken together, it seems that the deadline effects in our paradigm were due to a shift in the response criterion.

ECR and ESR performance was measured in two blocked conditions in which participants were instructed either to correct their errors or to signal their errors. A preliminary analysis revealed that a substantial number of errors were spontaneously corrected in the ESR condition although this was not instructed. Possibly, the tendency to spontaneously correct errors is facilitated when an ESR is required (e.g., Steinhauser & Hübner, 2006b). Interestingly, more than one third of these spontaneously corrected errors did not lead to an ESR although this was instructed. However, the correction latency for these un-signaled errors was very short (168 ms). Accordingly, one could hypothesize that the participants were unsure whether they should signal an error when this error was almost immediately corrected.

Most important for the present objective are the deadline effects on ECR and ESR performance. Based on our simulations, we derived different predictions for the CM account and the RM account. According to the RM account, we should have observed similar effects of the deadline on ECR and ESR performance. In contrast, the CM model predicts different effects, at least for the latencies. According to this model, the deadline effect on the ESR latency should be weaker than that on the ECR latency. Our data clearly support the RM account. The deadline effects on ESR and ECR performance were similar, not only for the latencies but also for the hit rates and the false alarm rates. We observed a difference only with respect to the absolute latency and hit rate. ESRs were less frequent and slower than ECRs, which replicates the result of earlier studies (Rabbitt, 2002). This, however, is also consistent with an RM account, if we assume that this difference is due to an increased failure probability and duration of selecting the signaling response.

There are also other interesting results in our data. First, the deadline effect on the ECR latencies was stronger than that on the latency of the initial response. Second, the ECR and ESR hit rate slightly decreased with an increasing deadline. Finally, the false alarm rates were rather low. Although these findings cannot distinguish between the CM and RM model, they are consistent with the results of our simulations.

The fact that our experiment confirmed the predictions of the RM model in nearly every detail shows its high validity. In contrast, the CM model did not predict the similar criterion effects on ECR and ESR latency. However, we varied only a few parameters in our simulations (see Appendix B). Most of them were fixed to values used by Yeung et al. (2004) to simulate Ne/ERN data. This raises the question of whether other parameter values would have also allowed the CM model to predict our empirical results.

To see whether this is the case, we fitted the models to our data by an exhaustive parameter search. Usually, this procedure is difficult for connectionist models, for two reasons. First, model performance is strongly influenced by noise, which reduces the efficiency of search algorithms. Second, the number of parameters is often very large in connectionist models, so that searching the parameter space requires a high computational effort. Fortunately, Bogacz and Cohen (2004) introduced a search procedure that deals with the problems inherent in neural networks. To keep the computational effort low, we optimized only those parameters which we considered to be relevant for the present purpose.

2.4. Model Fit

Each model was fitted to the data of eight empirical variables: the latencies, hit rates, and false alarm rates of ECRs and ESRs, as well as the latencies of correct responses and the overall error rate. Table 2 gives an overview of the parameters which were fitted for each model. Ten parameters were the same for the RM model and the CM model: the three response criteria (Clow, Cmed, Chigh), the time after which stimulus processing was stopped following the first response (dstop), the time constants corresponding to the duration of non-decisional processes of the initial response, ECR, and ESR (TND1, TND2, TND3), the time per cycle (Tcycle), and two scaling factors which weighted the strength of inhibitory (sci) and excitatory connection weights (sce). For the RM model, the additional parameter P(ESR fails) was used, which specifies how frequently the selection of the ESR fails despite the occurrence of an internal correction response. For the CM model, the detection delay D and the detection threshold K were additionally optimized.

All other parameters (e.g., connection weights) were fixed to the values used by Yeung et al. (2004). Please note that the response criterion C was the only parameter that was allowed to vary between the criterion conditions. All other parameters were held constant across these conditions.
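As a compact overview, the free parameters of the two models can be written down as plain lists. The identifiers below are merely ASCII renderings of the symbols in the text (for example `C_low` for Clow and `P_ESR_fails` for P(ESR fails)) and are not taken from the original simulation code.

```python
# Parameters fitted for both models (ASCII renderings of the symbols in the text).
SHARED = [
    "C_low", "C_med", "C_high",   # response criteria, one per deadline condition
    "d_stop",                     # time after which stimulus processing stops
    "T_ND1", "T_ND2", "T_ND3",    # non-decisional time: initial response, ECR, ESR
    "T_cycle",                    # time per cycle
    "sc_i", "sc_e",               # scaling of inhibitory / excitatory weights
]

FREE_PARAMETERS = {
    "RM": SHARED + ["P_ESR_fails"],  # probability that ESR selection fails
    "CM": SHARED + ["D", "K"],       # detection delay and detection threshold
}

# Only the response criterion differs between the criterion conditions.
CRITERION_SPECIFIC = {"C_low", "C_med", "C_high"}

for model, names in FREE_PARAMETERS.items():
    print(model, len(names))  # RM 11, then CM 12
```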

We applied the search procedure introduced by Bogacz and Cohen (2004), which they explicitly developed for neural networks like the present one. The algorithm proceeds in three phases: a starting parameter search, an optimization phase, and a tuning phase. Each of these phases consists of a fixed number of iterations. We set the number of iterations for each phase to 400, 200, and 100, respectively. In each iteration, the respective model was calculated 5000 times for each stimulus type (incongruent, neutral) within each of the three response criterion conditions. Again, the results were averaged across the two stimulus types. The whole algorithm was applied ten times for each stage of fitting.
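The computational effort implied by these settings follows from simple multiplication. The sketch below only restates the numbers given above; the variable names are illustrative, and it assumes that every iteration evaluates the model exactly once per stimulus type and criterion condition.

```python
ITERATIONS_PER_PHASE = {"starting_search": 400, "optimization": 200, "tuning": 100}
RUNS_PER_CELL = 5000        # model runs per stimulus type and criterion condition
STIMULUS_TYPES = 2          # incongruent, neutral
CRITERION_CONDITIONS = 3    # low, intermediate, high
APPLICATIONS = 10           # the whole algorithm was applied ten times

runs_per_iteration = RUNS_PER_CELL * STIMULUS_TYPES * CRITERION_CONDITIONS
iterations_per_application = sum(ITERATIONS_PER_PHASE.values())
runs_per_application = runs_per_iteration * iterations_per_application
runs_per_fitting_stage = runs_per_application * APPLICATIONS

print(runs_per_iteration)          # 30000
print(iterations_per_application)  # 700
print(runs_per_application)        # 21000000
print(runs_per_fitting_stage)      # 210000000
```

Under this reading, a single fitting stage requires on the order of 2 × 10^8 simulated trials per model, which illustrates why only a subset of parameters was optimized.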

For estimating the quality of the fits we calculated as goodness-of-fit statistics the mean
