
5.3. In-Beam Tests

5.3.1. The Experiment Setups

The efficiency of the radiation mitigation techniques implemented in the course of this thesis was evaluated in two in-beam tests at the Cooler Synchrotron (COSY) in Jülich, Germany, one in 2012 and one in 2013. In both tests, the particle accelerator provided protons at ∼2 GeV with a maximum beam flux on the order of 10⁷ s⁻¹ cm⁻².

Both setups consisted of the device under test (DUT), which was mounted in the beam line, and a support board, which was mounted outside the beam line. The data of 28 GET4s was generated on the support board, emulating the GET4 message protocol. Because the generated data was deterministic, it could easily be analyzed for corruption after processing on the device under test. Using real GET4 ASICs as the data source would have been possible, but it would have unnecessarily increased the complexity of the setup and made data analysis more difficult.
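The benefit of deterministic data can be illustrated with a minimal sketch. The word layout below (an 8-bit channel id and a 24-bit counter) is a hypothetical pattern chosen for illustration only; it is not the GET4 message format, and the function names are assumptions. The point is that the receiver can regenerate the expected stream and compare it word by word.

```python
from __future__ import annotations


def expected_word(channel: int, index: int) -> int:
    """Hypothetical deterministic pattern, NOT the real GET4 message format."""
    # Pack an 8-bit channel id and a 24-bit running counter into 32 bits.
    return ((channel & 0xFF) << 24) | (index & 0xFFFFFF)


def generate(channel: int, count: int) -> list[int]:
    """Emulate the data generator on the support board."""
    return [expected_word(channel, i) for i in range(count)]


def find_corruption(channel: int, received: list[int]) -> list[int]:
    """Return the indices at which a received word differs from the pattern."""
    return [i for i, w in enumerate(received) if w != expected_word(channel, i)]


if __name__ == "__main__":
    data = generate(channel=5, count=2000)
    data[1234] ^= 1 << 7  # emulate a single bit flip on the device under test
    print(find_corruption(5, data))  # [1234]
```

This index-based comparison assumes that no words are lost or duplicated; a stream with dropped words would need resynchronization before comparing.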

The firmware implementing the read-out and the radiation mitigation techniques was running on a SysCore version 2 board. The on-board configuration controller could be configured either to execute blind scrubbing or to remain idle.
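The difference between the two controller modes can be sketched with a toy simulation. Blind scrubbing periodically rewrites the configuration memory from a golden copy without reading it back first. All parameters below (memory size, upset probability, scrub period) are hypothetical and are not taken from the experiment; the model only shows that upsets accumulate without scrubbing and are removed with it.

```python
from __future__ import annotations

import random


def popcount_diff(memory: list[int], golden: list[int]) -> int:
    """Count the bits in which memory differs from the golden configuration."""
    return sum(bin(m ^ g).count("1") for m, g in zip(memory, golden))


def simulate(scrubbing: bool, steps: int = 1000, scrub_period: int = 10,
             upset_prob: float = 0.05, words: int = 256, seed: int = 1) -> int:
    """Return the number of upset bits left after `steps` time steps.

    All default values are illustrative assumptions, not measured quantities.
    """
    rng = random.Random(seed)
    golden = [rng.getrandbits(32) for _ in range(words)]
    memory = list(golden)
    for step in range(1, steps + 1):
        if rng.random() < upset_prob:
            # An SEU flips one random bit of one random configuration word.
            memory[rng.randrange(words)] ^= 1 << rng.randrange(32)
        if scrubbing and step % scrub_period == 0:
            # Blind scrubbing: rewrite everything without reading back first.
            memory = list(golden)
    return popcount_diff(memory, golden)


if __name__ == "__main__":
    print("upset bits without scrubbing:", simulate(scrubbing=False))
    print("upset bits with scrubbing:   ", simulate(scrubbing=True))
```

In this model an upset persists for at most one scrub period when scrubbing is enabled, whereas it stays in memory until a full reprogramming when the controller is idle.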

The test procedure illustrated in figure 5.9 is specifically designed to evaluate the efficiency of scrubbing. While the procedure was executed, scrubbing was either enabled, which means that the configuration memory of the FPGA was refreshed continuously, or disabled, in which case SEUs could accumulate over time. The test procedure is divided into the following steps:

• In the first step (Logfile Header), important information is recorded in the header of the logfile to aid subsequent analysis and archiving of the data.

• The step Init Readback is required to calibrate the board that counts SEUs. The SEU Counter board is explained in more detail in section 5.3.3.

4 It is not uncommon to share beam time among several experimenters. This becomes problematic when the setup of one group cannot sustain high particle rates but another group requires high particle rates.

[Flowchart of the test procedure (figure 5.9). Boxes and their annotations: Logfile Header (log some parameters of this run: scrubbing on/off, with/without redundancy, data taking on/off, comment set via command-line argument); Init Readback (readback reference measurement for 3 minutes, comparing the SEU rate of both devices in beam); Start; Set Testreg. (set 128 32-bit-wide test registers, either with or without redundancy); Record Data (optional; record 3 seconds of data); SEU counting (read back the configuration of the reference board and check for SEUs); Check Testreg. (check the test registers for errors); Test DUT (test whether the device under test streams valid data; branches "DUT ok" and "DUT not ok"); Test DUT again (run the test again to see whether the error is temporary or persistent; branches "DUT ok" and "DUT not ok"); Reprogram (fully reprogram everything and continue).]

Figure 5.9.: Illustration of the test procedure performed during the 2012 in-beam test to evaluate the efficiency of the applied radiation mitigation techniques. If enabled, scrubbing is running continuously in the background during all steps except Logfile Header and Init Readback. The key aspect is to test twice for correct operation, which gives scrubbing time to repair the device. Without data taking, a regular loop lasts about 8 seconds. If Test DUT again is reached and the device did recover, it lasts about 12 seconds. A cycle with full reprogramming takes 18 seconds. The same procedure was also used in 2013 but without the steps Set Testreg and Check Testreg.

The procedure then enters the main loop. During the runs with scrubbing enabled, the scrubbing engine is turned on at this point and continuously refreshes the FPGA configuration memory in the background.

• Set Testreg and Check Testreg are only included in the 2012 test. Here, a set of registers is tested for errors. Two firmware versions were created, one implementing TMR for these registers and one without redundancy.

• In Record Data, data is recorded to hard disk for three seconds (∼15 MB) for subsequent offline analysis. This step is optional so that the hard disk is not filled unnecessarily, e.g. during debugging or during reference runs without beam. The timescale of three seconds is comparable to the recovery time one can expect when board failures are detected and the board is reconfigured by an entity outside the radiation zone.

• In SEU counting, the current number of accumulated SEUs is recorded.

• The key idea of the algorithm is to check twice for an operational device, in Test DUT and in Test DUT again. The functional status is determined from the online analysis of 2000 data samples. If not all data samples are valid the first time, the test is repeated to give scrubbing time to repair the device. A single test takes much longer (2 s) than a full scrubbing cycle (80 ms).

• If the second test fails, the complete setup is fully reset (Reprogram).

The key aspect is to repeat Test DUT in case of an error, which gives scrubbing the opportunity to repair the device in the meantime. Only after two consecutive errors is the device considered to be “permanently” corrupted. If the first test fails but the second test runs flawlessly, the error is counted as “temporary”.
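The classification rule described above can be sketched as follows. The names `Outcome`, `run_cycle`, `test_dut` and `reprogram` are hypothetical and stand in for the actual test script, which is not reproduced here; `test_dut` is assumed to return True when all analyzed data samples are valid.

```python
from enum import Enum
from typing import Callable


class Outcome(Enum):
    OK = "ok"
    TEMPORARY = "temporary"
    PERMANENT = "permanent"


def run_cycle(test_dut: Callable[[], bool],
              reprogram: Callable[[], None]) -> Outcome:
    """One pass through Test DUT, Test DUT again and Reprogram (hypothetical API)."""
    if test_dut():
        return Outcome.OK
    # First test failed: repeating it gives background scrubbing time to
    # repair the device in the meantime.
    if test_dut():
        return Outcome.TEMPORARY
    # Two consecutive failures: count as permanent and reset the whole setup.
    reprogram()
    return Outcome.PERMANENT


if __name__ == "__main__":
    # Scripted test results for three loop cycles.
    results = iter([True, False, True, False, False])
    log = [run_cycle(lambda: next(results), lambda: None) for _ in range(3)]
    print([o.value for o in log])  # ['ok', 'temporary', 'permanent']
```

Because a single test (2 s) spans many scrubbing cycles (80 ms), a repairable upset should have been rewritten well before the second test completes, which is what makes the second result meaningful.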

It should be noted that the test procedure is mainly designed for the evaluation of the effect of scrubbing on the recorded data. The decisive tests, Test DUT and Test DUT again, operate on data only and leave aside any test of control register validity. For that reason, the positive effect of Selective TMR is not directly visible in the results (see section 6.3).

With the steps Set Testreg and Check Testreg, the redundancy effect is measured nevertheless, but the test registers used were added for the sole purpose of this test and are not part of the logic of the original GET4 read-out firmware.

Test Setup 2012 The experiment took place from the 6th to the 9th of August 2012.

For the reasons described in section 4.3.1, only the GET4 module was installed in the beam line. The Optics module was operated on a supporting board, the same board on which the GET4 data generator was running (see also figure 4.5).

The main tests for a functional device (Test DUT and Test DUT again) are based on data consistency. With Selective TMR, however, the data path is not protected by TMR. A comparison of a firmware implementing Selective TMR with a plain firmware without redundancy will therefore not yield meaningful results. To measure the redundancy effect nevertheless, 128 registers (32 bits wide) were integrated into the DUT firmware. Two DUT firmwares were synthesized, one with TMR'ed test registers and one with non-redundant test registers.
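The principle behind the TMR'ed test registers can be modeled in software with a bitwise 2-out-of-3 majority vote. This is a sketch of the general technique, assuming a standard voter; it is not the firmware's HDL, and the class and method names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority of three 32-bit register copies."""
    return ((a & b) | (a & c) | (b & c)) & MASK32


class TmrRegister:
    """Software model of a triplicated 32-bit register (hypothetical names)."""

    def __init__(self, value: int = 0) -> None:
        self.copies = [value & MASK32] * 3

    def write(self, value: int) -> None:
        self.copies = [value & MASK32] * 3

    def read(self) -> int:
        return vote(*self.copies)

    def flip(self, copy: int, bit: int) -> None:
        """Emulate an SEU in one of the three copies."""
        self.copies[copy] ^= 1 << bit


if __name__ == "__main__":
    reg = TmrRegister(0xDEADBEEF)
    reg.flip(copy=1, bit=4)  # a single upset is masked by the voter
    print(hex(reg.read()))   # 0xdeadbeef
```

The voter masks any single upset per bit position. Two upsets hitting the same bit in different copies still corrupt the read value, which is why redundancy is typically combined with a repair mechanism.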

The main intention for the test, however, is the offline analysis of the recorded data.

Test Setup 2013 The experiment took place from the 2nd to the 4th of July 2013.

This time the full GET4 read-out controller firmware was exposed to the particle beam.

The supporting board was only required to execute the logic for deterministic generation of GET4 data.

The registers for testing redundancy are not implemented in the 2013 firmware. As they were implemented as part of the tunnel modules, integrating them into the 2013 firmware would have required extra effort, and the only gain would have been a repetition of the 2012 test.

The main goal of the in-beam test is to operate a complete read-out controller firmware in a very high radiation environment and not to repeat the TMR test of 2012. Therefore, these registers are omitted in the 2013 test.