Detector Dead Time Estimations Based on Parameters Measured at In-Beam Tests

Based on these numbers, the implications for the electronics at the ToF wall can be estimated, here for the inner region where the radiation level is highest.

Expected SEU Rate at CBM-ToF

First, the expected SEU rate at CBM-ToF is calculated. It is given by the following relation:

SEU rate = device cross section · fast hadron flux (7.1)

SEU rate = (No. of configuration bits · bit cross section) · fast hadron flux (7.2)

The required parameters are all available:

• Number of configuration bits: In a current GET4 read-out implementation for a Spartan-6 LX150T FPGA that supports 57 GET4 ASICs, the “number of occupied slices” is reported to be 36 % of all available slices. To reduce cost, the size of the final ToF-ROC’s FPGA will most likely be chosen such that few fabric resources remain unused; 80 % occupied slices should be a realistic value.

A Spartan-6 LX150T firmware used to “full capacity” (80 %) would therefore support 57 GET4s · 80/36 ≈ 126 GET4s. The firmware for the Spartan-6 is about 4 MB or 3.2·10⁷ bit, which translates to 2.5·10⁵ bit per GET4.

The ToF-TDR describes a detector layout with 26 592 GET4s connected to 336 ROCs [TOF, p. 59], i.e. about 80 GET4s per ROC.

The number of configuration bits in the proposed design is therefore 2·10⁷ bit per ROC and 6.8·10⁹ bit ≈ 7·10⁹ bit for the whole experiment.

• Bit cross section: The bit cross section of Xilinx Series 7 devices is published in [Xil14b, page 27]. The numbers are 6.99·10⁻¹⁵ cm² (configuration memory) and 6.32·10⁻¹⁵ cm² (BRAM memory); it should be safe to assume 7·10⁻¹⁵ cm² for both.

• Fast hadron flux: FLUKA simulations predict a fast hadron flux at the inner region of the ToF wall of about 10⁴ cm⁻² s⁻¹ [Sen11].

With these numbers, the expected SEU rate at CBM-ToF can now be calculated according to equation 7.2.
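The estimate can be retraced numerically. The following Python sketch is only an illustration: all variable and function names are assumptions introduced here, and it carries unrounded intermediate values, so its results deviate slightly from the rounded figures quoted in the text.

```python
# Parameters quoted in the text; names are illustrative, not from the thesis.
GET4_PER_TEST_FPGA = 57            # GET4s served by the current Spartan-6 design
OCCUPIED_NOW = 0.36                # occupied slices in the current design
OCCUPIED_FULL = 0.80               # assumed occupancy of the final ToF-ROC FPGA
FIRMWARE_BITS = 3.2e7              # about 4 MB of configuration data
GET4_PER_ROC = 80                  # 26 592 GET4s on 336 ROCs, rounded up
ROC_COUNT = 336
BIT_CROSS_SECTION_CM2 = 7e-15      # per configuration bit
FAST_HADRON_FLUX_PER_CM2_S = 1e4   # inner region of the ToF wall


def seu_rate(config_bits: float) -> float:
    """Equation 7.2: configuration bits x bit cross section x fast hadron flux."""
    return config_bits * BIT_CROSS_SECTION_CM2 * FAST_HADRON_FLUX_PER_CM2_S


# Scale the current firmware to a fully used FPGA, then to one ROC and the wall.
get4_per_full_fpga = GET4_PER_TEST_FPGA * OCCUPIED_FULL / OCCUPIED_NOW
bits_per_get4 = FIRMWARE_BITS / get4_per_full_fpga
bits_per_roc = bits_per_get4 * GET4_PER_ROC
bits_whole_tof = bits_per_roc * ROC_COUNT

rate_per_roc = seu_rate(bits_per_roc)
rate_whole_tof = seu_rate(bits_whole_tof)

print(f"bits per ROC:   {bits_per_roc:.2e}")
print(f"bits whole ToF: {bits_whole_tof:.2e}")
print(f"SEU rate per ROC:   {rate_per_roc:.2e} 1/s "
      f"(one SEU every {1 / rate_per_roc / 60:.0f} min)")
print(f"SEU rate whole ToF: {rate_whole_tof:.2f} 1/s "
      f"(one SEU every {1 / rate_whole_tof:.1f} s)")
```

Because the text rounds the total number of bits up to 7·10⁹ before multiplying, the whole-detector rate computed here comes out a few percent below the quoted value.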

SEU rate (per ROC) = 2·10⁷ · 7·10⁻¹⁵ cm² · 10⁴ s⁻¹ cm⁻² = 0.0014 s⁻¹ (7.3)

⟹ every 12 minutes in a single ROC

SEU rate (whole ToF) = 7·10⁹ · 7·10⁻¹⁵ cm² · 10⁴ s⁻¹ cm⁻² = 0.49 s⁻¹ (7.4)

⟹ every 2 seconds in one of the ROCs of the detector

Expected Error Rate at CBM-ToF

Not every SEU has an effect on the running hardware. With the results from the in-beam tests we can now estimate the error rate.

It should be noted that such SEU-to-error factors can also be obtained with fault injection tests (see e.g. section 5.3.2). However, the results of fault injection tests only cover SEUs in the static part of the configuration memory (PSMs, LUTs) but not in the dynamic part (FFs, memory). Therefore, in-beam tests are mandatory to analyze the full spectrum of SEU effects. This is especially true for commercial off-the-shelf electronics where the internals are not known in full detail.

The 2012 in-beam test has better statistics since the test lasted longer and the particle rate was higher. However, only a stripped-down version of the firmware was exposed to beam particles in the 2012 test. For that reason the following considerations are based on the numbers from the 2013 in-beam test despite its lower statistics. After all, the 2013 measurements are consistent with the results of 2012.

First we consider the case without scrubbing. Figure 6.9 shows that, without scrubbing, the average number of test procedure iterations until a system error is 1/14.8 % = 1/0.148 ≈ 6.8. One iteration takes about 7.5 seconds, so the average lifetime can be estimated at 51 seconds. In 51 seconds and with 0.61 SEUs/s, 31 SEUs are collected in the SEU Counter board. As described in section 6.3.3, the SEU rate in the DUT was only 70 % of the SEU rate in the SEU Counter board. This means that in 51 seconds 21 SEUs have accumulated in the DUT.
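The chain of steps in this paragraph can be written out as a short calculation. This is a minimal sketch with illustrative names; without intermediate rounding the final value is about 21.6, which the text quotes as 21.

```python
# Measured values quoted in the text (2013 in-beam test, no scrubbing).
ERROR_FRACTION = 0.148        # share of iterations ending in a system error
ITERATION_S = 7.5             # duration of one test procedure iteration
COUNTER_SEU_RATE = 0.61       # SEUs/s seen by the SEU Counter board
DUT_TO_COUNTER_RATIO = 0.70   # DUT SEU rate relative to the SEU Counter board

iterations_until_error = 1 / ERROR_FRACTION
lifetime_s = iterations_until_error * ITERATION_S
seus_in_counter = lifetime_s * COUNTER_SEU_RATE
seus_in_dut = seus_in_counter * DUT_TO_COUNTER_RATIO

print(f"iterations until error: {iterations_until_error:.1f}")
print(f"average lifetime:       {lifetime_s:.0f} s")
print(f"SEUs in counter board:  {seus_in_counter:.0f}")
print(f"SEUs in DUT:            {seus_in_dut:.1f}")
```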

Without scrubbing, the expected error rate at CBM-ToF is therefore:

error rate (no scrubbing, per ROC) = SEU rate (per ROC) / 21 = 6.7·10⁻⁵ s⁻¹ (7.5)

⟹ every 4 hours in a single ROC

error rate (no scrubbing, whole ToF) = SEU rate (whole ToF) / 21 = 0.023 s⁻¹ (7.6)

⟹ every 43 seconds in one of the ROCs of the detector

When scrubbing is implemented, two error rates have to be distinguished: temporary errors and permanent errors. Note that with scrubbing enabled, data from figure 6.9 cannot be used to determine the rate of temporary errors. Figure 6.9 does not show those errors that have already been fixed silently by scrubbing during early steps of the test procedure before Test DUT is reached. However, it can safely be assumed that with scrubbing enabled, temporary errors occur at the same rate as the error rate calculated above for the case without scrubbing.
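Both the error rate without scrubbing and, under the assumption just stated, the temporary-error rate with scrubbing follow from dividing the SEU rate by the 21 SEUs accumulated per error. A minimal sketch, using the rounded SEU rates from equations 7.3 and 7.4 and illustrative names:

```python
SEUS_PER_ERROR = 21            # SEUs accumulated in the DUT per system error
SEU_RATE_PER_ROC = 0.0014      # 1/s, equation 7.3
SEU_RATE_WHOLE_TOF = 0.49      # 1/s, equation 7.4


def error_rate(seu_rate: float, seus_per_error: float) -> float:
    """Errors per second if one error occurs per `seus_per_error` SEUs."""
    return seu_rate / seus_per_error


error_rate_per_roc = error_rate(SEU_RATE_PER_ROC, SEUS_PER_ERROR)
error_rate_whole_tof = error_rate(SEU_RATE_WHOLE_TOF, SEUS_PER_ERROR)

# Mean time between errors is the inverse of the rate.
hours_between_errors_per_roc = 1 / error_rate_per_roc / 3600
seconds_between_errors_whole_tof = 1 / error_rate_whole_tof

print(f"per ROC:   {error_rate_per_roc:.1e} 1/s, "
      f"one error every {hours_between_errors_per_roc:.1f} h")
print(f"whole ToF: {error_rate_whole_tof:.3f} 1/s, "
      f"one error every {seconds_between_errors_whole_tof:.0f} s")
```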

temporary-error rate (scrubbing, per ROC) = SEU rate (per ROC) / 21 = 6.7·10⁻⁵ s⁻¹ (7.7)

⟹ every 4 hours in a single ROC

temporary-error rate (scrubbing, whole ToF) = SEU rate (whole ToF) / 21 = 0.023 s⁻¹ (7.8)

⟹ every 43 seconds in one of the ROCs of the detector

Permanent errors, on the contrary, can be deduced from figure 6.9 as they are not removed by scrubbing. With 0.139 % of the iterations showing permanent errors, the average number of iterations until a permanent error occurs is about 720. This means the DUT sustained an average of about 2300 SEUs until it failed. When scrubbing is enabled, the expected rate of permanent errors at CBM-ToF is therefore:

permanent-error rate (scrubbing, per ROC) = SEU rate (per ROC) / 2300 = 6.1·10⁻⁷ s⁻¹ (7.9)

⟹ every 19 days in a single ROC

permanent-error rate (scrubbing, whole ToF) = SEU rate (whole ToF) / 2300 = 2.1·10⁻⁴ s⁻¹ (7.10)

⟹ every 80 minutes in one of the ROCs of the detector

Unfortunately, the factor of 2300 is based on only one single event in the 2013 in-beam test, when a permanent error was measured with scrubbing enabled. The uncertainty of the factor is accordingly high.
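The text does not spell out how the factor of 2300 is obtained. The sketch below assumes it follows the same chain as in the no-scrubbing case (iterations until failure, 7.5 s per iteration, 0.61 SEUs/s in the SEU Counter board, 70 % of that rate in the DUT), which reproduces the quoted value. All names are illustrative, and the rates use the rounded figures from equations 7.3 and 7.4.

```python
# Assumed derivation of the SEUs-per-permanent-error factor (see lead-in).
PERMANENT_ERROR_FRACTION = 0.00139   # 0.139 % of iterations (figure 6.9)
ITERATION_S = 7.5
COUNTER_SEU_RATE = 0.61              # SEUs/s in the SEU Counter board
DUT_TO_COUNTER_RATIO = 0.70

iterations_until_failure = 1 / PERMANENT_ERROR_FRACTION
seus_until_failure = (iterations_until_failure * ITERATION_S
                      * COUNTER_SEU_RATE * DUT_TO_COUNTER_RATIO)

# Permanent-error rates, using the rounded factor quoted in the text.
SEUS_PER_PERMANENT_ERROR = 2300
SEU_RATE_PER_ROC = 0.0014            # 1/s, equation 7.3
SEU_RATE_WHOLE_TOF = 0.49            # 1/s, equation 7.4

permanent_rate_per_roc = SEU_RATE_PER_ROC / SEUS_PER_PERMANENT_ERROR
permanent_rate_whole_tof = SEU_RATE_WHOLE_TOF / SEUS_PER_PERMANENT_ERROR

days_between_per_roc = 1 / permanent_rate_per_roc / 86400
minutes_between_whole_tof = 1 / permanent_rate_whole_tof / 60

print(f"iterations until permanent error: {iterations_until_failure:.0f}")
print(f"SEUs until permanent error:       {seus_until_failure:.0f}")
print(f"per ROC:   one permanent error every {days_between_per_roc:.0f} days")
print(f"whole ToF: one permanent error every {minutes_between_whole_tof:.0f} min")
```

The whole-detector interval evaluates to slightly under the 80 minutes quoted in the text, which is a rounded value.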

Expected Radiation Induced Dead Time of CBM-ToF Electronics

To determine the fraction of time in which an error is disturbing the setup, one needs to divide the time required to repair an error by the average time between the occurrence of two errors. A temporary error will exist only for a short time (on the order of 100 ms) until it is corrected by scrubbing, whereas a permanent error persists until the device is reset externally (probably for a few seconds if an intelligent error detection mechanism is implemented).

dead time (without scrubbing) = 3 seconds every 43 seconds

⟹ in total: 7 %

dead time (with scrubbing) = 100 milliseconds every 43 seconds and 3 seconds every 80 minutes

⟹ in total: 0.3 %

These values are based on the assumption that the whole detector is down when a single error occurs. This is not the case, as an error affects only one out of 336 boards (see point 4 in section 7.1). For physics cases that do not require a complete picture of the detector, these values can be further reduced by a factor of 336. Dead time without scrubbing is then ∼0.02 % and dead time with scrubbing ∼0.0009 %.
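The dead-time figures follow from dividing each repair time by the mean time between the corresponding errors and adding the contributions. A minimal sketch with illustrative names; simply adding the fractions is an approximation that holds because they are small.

```python
ROC_COUNT = 336


def dead_time_fraction(error_classes):
    """Sum of repair time / mean time between errors over all error classes.

    `error_classes` is a list of (repair_s, interval_s) pairs.
    """
    return sum(repair_s / interval_s for repair_s, interval_s in error_classes)


# Without scrubbing: every error needs an external reset (about 3 s).
without_scrubbing = dead_time_fraction([(3.0, 43.0)])

# With scrubbing: temporary errors last about 100 ms; permanent errors
# (one every 80 minutes) still need the 3 s reset.
with_scrubbing = dead_time_fraction([(0.1, 43.0), (3.0, 80 * 60.0)])

for label, fraction in (("without scrubbing", without_scrubbing),
                        ("with scrubbing", with_scrubbing)):
    print(f"{label}: {100 * fraction:.2f} % (whole detector), "
          f"{100 * fraction / ROC_COUNT:.4f} % (single ROC affected)")
```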

Expected Radiation Induced Data Corruption at CBM-ToF

The expected data quality for CBM-ToF would be very much the same as shown in figure 6.10 because points 1 and 2 in section 7.1 counterbalance each other.

corrupted data (without scrubbing) = 3–4 %

corrupted data (with scrubbing) = 0.1 %

As before, corrupted data from a single ROC does not necessarily render the data from the other ROCs useless. Therefore, depending on the physics case, the expected percentage of corrupted data can be further reduced by a factor of 336 (the number of installed ROCs). This results in about 0.01 % of the data being corrupted without scrubbing, and only about 0.0003 % with scrubbing.

Impact on CBM-ToF Strategy

The conceptual design for CBM-ToF consists of six different modules, organized in an “inner wall” (4.3 m × 3 m, modules M1 to M3) and an “outer wall” (rest of the 12 m × 9 m wall, modules M4 to M6) [DHA⁺14]. The results look promising and suggest putting FPGAs close to the inner region. The ToF-TDR foresees SRAM-based read-out electronics directly on the modules of the “outer wall”. For the read-out of the “inner wall”, however, SRAM-based electronics will be placed outside the area of the “inner wall” [TOF, p. 18]. The results of this thesis already provided important input to this decision. The ToF-TDR [TOF] references [Mül14] in this context, and the information from [Mül14] is based on assessments from the present work. Using FPGAs in such a harsh radiation environment would possibly have been rejected without this research.

As a backup strategy, the radiation level can be lowered further by a factor of 10 when moving the electronics one or two meters farther away [Sen11]. Of course, this would come with the drawback of higher cost due to more cabling. However, given the results of the present work, it seems unlikely that the backup strategy is required. One might rather consider the usage of FPGAs in zones with even higher radiation levels instead.

This chapter summarizes the achievements of this thesis and gives an outlook for future work that remains to be done before an FPGA based CBM-ToF read-out chain can be put into service.

In document: Radiation mitigation for SRAM-Based FPGAs in the CBM experiment (pages 116-121)