
The use of a dual-rail encoding within the exclusive OR trees enables a standard cell implementation that employs only multiplexers. The presented Transmission-Gate Exclusive OR is an area-efficient realization containing four transmission gates.
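The multiplexer-only property can be illustrated at the logic level. The following minimal sketch assumes that every signal is carried as a pair of a true rail and a complement rail; the names `mux`, `dual` and `xor_dual_rail` are illustrative and not taken from the cell library. Since both polarities of `b` are available, one multiplexer controlled by `a` selects either `b` or its complement, and a second multiplexer produces the complement rail of the result without an inverter.

```python
from itertools import product

def mux(sel, in0, in1):
    """2:1 multiplexer: returns in1 when sel is 1, otherwise in0."""
    return in1 if sel else in0

def dual(bit):
    """Encode a bit as a (true rail, complement rail) pair."""
    return (bit, 1 - bit)

def xor_dual_rail(a, b):
    """XOR of two dual-rail signals using only multiplexers."""
    a_t, _ = a
    b_t, b_c = b
    out_t = mux(a_t, b_t, b_c)  # a = 0 selects b, a = 1 selects not b
    out_c = mux(a_t, b_c, b_t)  # complement rail of the result
    return (out_t, out_c)

# Exhaustive check against the built-in XOR operator.
for x, y in product((0, 1), repeat=2):
    assert xor_dual_rail(dual(x), dual(y)) == dual(x ^ y)
```

A 2:1 multiplexer is commonly built from two transmission gates, which would be consistent with the count of four transmission gates for both output rails; the actual cell schematic is not reproduced by this sketch.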

The experimental evaluation shows a cell area reduction of 50 %. As a side effect, the cell is more than three times faster and almost halves the average power consumption and energy.

The cell’s application to multiple building blocks of the previously discussed fault tolerance architectures reconfirms the targeted area efficiency increase and lowers the area overhead by at least 50 %. For registers with 63 or more bits, the bitwise implementation of Dual Modular Redundancy is even more expensive than Error Correction based on the logarithmic characteristic.

[Figure: plot of Area Overhead to Original [+%] (axis range 0 to 400) over Register Size [bit] (7, 15, 31, 63, 127, 255). Legend: SEC-TG, SED-TG, SECDED-TG, DED-TG, SEC, SED, SECDED, DED, TMR, DWC.]

Figure 7.8.: Area Overhead - Area Efficient Architectures (SED TG, DED TG, SEC TG, SECDED TG) - Single Register (data from Table A.11 and Table A.12).

Summary and Discussion of Part II

This part targeted soft errors in the sequential state of a circuit’s random logic by detection, localization and correction implemented with different localization granularities and correction capabilities. In place of a bitwise redundancy implementation, error detecting and correcting codes are used across all presented fault tolerance architectures, which are recapitulated in the following.

Chapter 4 - Non-Concurrent Detection and Localization of Single Event Upsets - focuses on circuits equipped with clock gating to reduce the power consumption during idle phases. During these clock-gated phases, the sequential state of a module is retained over long periods of time. Thus, assuring the data correctness upon leaving the gated phase is a necessity. The presented two-tiered architecture combines an efficient error detection based on individual register parities with a module-wide localization founded on a logarithmic checksum of register parities. The Parity Pair Latch efficiently implements the detection by merging the register latches with the first level of the parity tree into a new standard cell. Compared to the reference implementation, it halves the area overhead and significantly accelerates the parity computation while considerably reducing the power consumption and energy. The Modulo-2 Address Characteristic of register parities is used for module-wide localization and achieves a low gate and connection count by using the optimal characteristic tree organization. Overall, for modules with registers containing 16 or more bits, the area overhead associated with detection and localization is reduced from over +90 % to below +20 %.
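The two-tiered scheme can be sketched in software. The sketch assumes that the modulo-2 address characteristic is the bitwise XOR of the 1-based addresses of all positions holding a one, and that the module checksum is stored before the gated phase and recomputed afterwards. All function names are hypothetical, and the Parity Pair Latch and the tree organization are not modeled.

```python
def parity(word):
    """Parity (XOR of all bits) of a non-negative integer register value."""
    return bin(word).count("1") & 1

def characteristic(bits):
    """XOR of the 1-based addresses of all positions holding a 1 (assumed definition)."""
    c = 0
    for address, bit in enumerate(bits, start=1):
        if bit:
            c ^= address
    return c

def module_checksum(registers):
    """Tier 1: one parity per register. Tier 2: characteristic over these parities."""
    return characteristic([parity(r) for r in registers])

def locate_faulty_register(stored_checksum, registers):
    """Return the 0-based index of the register hit by a single upset, or None."""
    syndrome = stored_checksum ^ module_checksum(registers)
    return None if syndrome == 0 else syndrome - 1

registers = [0b1011, 0b0000, 0b1111, 0b0110, 0b1000]
stored = module_checksum(registers)   # taken before entering the gated phase
assert locate_faulty_register(stored, registers) is None
registers[3] ^= 0b0100                # single event upset during the gated phase
assert locate_faulty_register(stored, registers) == 3
```

The checksum needs only a logarithmic number of bits in the register count, because a single flipped register parity changes it by exactly the address of that register.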

Chapter 5 - Concurrent Online Correction of Single Event Upsets - targets single errors affecting the sequential state during the operational phase of a circuit. Single Error Detection is achieved by implementing the modulo-2 address characteristic for individual registers to derive a register-specific error condition. The protected storage of the error condition eliminates false detections that may result from soft errors directly affecting the architecture while the register content is correct. Thus, all single errors are detected and localized and can be corrected by recomputation. Compared to bitwise Dual Modular Redundancy, the use of a logarithmic error detecting and correcting code results in a lower area overhead, which is almost halved for larger registers. Single Error Correction within one clock cycle is achieved by exploiting the computed localization information. It is used to control the efficient low-level correction provided by the Bit-Flipping Latch standard cell, which augments a latch with the ability to invert its stored value. The Bit-Flipping Latch requires only +20 % area in addition to a latch, has no negative impact on the timing behavior of the data path and reduces the average power consumption and energy by 25 %. The self-contained online correction architecture has a time vulnerability factor, i.e. the unprotected time interval of a register, of zero. Compared to bitwise Triple Modular Redundancy, the area overhead is reduced by more than one third for larger registers.
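The interplay of localization and bit flipping can be sketched as follows. This is a behavioral model under the assumption that the characteristic is the XOR of the 1-based addresses of all latches holding a one; the class `ProtectedRegister` and its methods are hypothetical names, and the reference characteristic is simply assumed to be stored error-free.

```python
def characteristic(bits):
    """XOR of the 1-based addresses of all positions holding a 1 (assumed definition)."""
    c = 0
    for address, bit in enumerate(bits, start=1):
        if bit:
            c ^= address
    return c

class ProtectedRegister:
    """Register with a stored reference characteristic (assumed error-free)."""

    def __init__(self, bits):
        self.bits = list(bits)
        self.reference = characteristic(self.bits)

    def correct(self):
        """Locate a single upset and invert the addressed latch.

        Returns the 1-based address of the corrected bit, or 0 if no error
        was observed.
        """
        syndrome = self.reference ^ characteristic(self.bits)
        if syndrome:
            self.bits[syndrome - 1] ^= 1  # models the bit-flipping latch
        return syndrome

reg = ProtectedRegister([1, 0, 1, 1, 0, 0, 1])
golden = list(reg.bits)
reg.bits[4] ^= 1                 # single event upset at address 5
assert reg.correct() == 5
assert reg.bits == golden
```

A single flipped bit changes the characteristic by exactly its own address, so the difference between the stored and the recomputed value directly selects the latch to invert.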

Chapter 6 - Fault Tolerance in Presence of Multiple Bit Upsets - considers the behavior of the online architecture under Multiple Bit Upsets. While double errors affecting the register are correctly detected, their localization and thus correction is not possible due to the limited minimum Hamming distance of the used characteristic. However, the introduction of an additional register parity bit makes it possible to distinguish correctable single errors from double errors and completely avoids false corrections. The resulting Extended Characteristic C+ is computed with negligible hardware overhead by reusing intermediate results of the optimal characteristic tree implementation. Thus, Single and Double Error Detection as well as Single Error Correction Double Error Detection are facilitated at a lower area overhead than bitwise modular redundancy for registers with 15 or more bits while retaining all other architecture properties.
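The effect of the additional parity bit can be sketched with a small classifier. It assumes that the extended characteristic consists of the XOR of the 1-based addresses of all ones plus the overall register parity; the function names and the returned labels are illustrative, and the reuse of intermediate tree results is not modeled.

```python
def extended_characteristic(bits):
    """Return (address characteristic, overall parity) of a bit vector."""
    c, p = 0, 0
    for address, bit in enumerate(bits, start=1):
        if bit:
            c ^= address
            p ^= 1
    return c, p

def classify(reference, bits):
    """Compare a stored reference with the recomputed extended characteristic."""
    c_ref, p_ref = reference
    c_now, p_now = extended_characteristic(bits)
    syndrome = c_ref ^ c_now
    parity_changed = p_ref ^ p_now
    if syndrome == 0 and not parity_changed:
        return ("ok", None)
    if syndrome != 0 and parity_changed:
        return ("single", syndrome - 1)   # 0-based index of the bit to flip
    return ("uncorrectable", None)        # e.g. a double error

bits = [1, 0, 1, 1, 0, 0, 1]
reference = extended_characteristic(bits)

single = list(bits)
single[2] ^= 1
assert classify(reference, single) == ("single", 2)

double = list(bits)
double[0] ^= 1
double[1] ^= 1                            # addresses 1 and 2, syndrome 1 ^ 2 = 3
assert classify(reference, double) == ("uncorrectable", None)
```

Without the parity bit, the double error in the example would yield the syndrome 3 and trigger a false correction of the bit at address 3. With it, the unchanged parity marks the non-zero syndrome as uncorrectable.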

Chapter 7 - Area Efficient Characteristic Computation - analyzes the architecture’s area overhead in more detail and identifies the characteristic computation as a major contributor. Because the characteristic tree organization is already optimal, a further area reduction can only be expected from an optimized Exclusive OR standard cell. The presented Transmission-Gate Exclusive OR is 50 % smaller, three times faster and has a lower power consumption and energy than the reference Exclusive OR. Its application to the characteristic tree as well as other building blocks of the presented fault tolerance architectures lowers their previously reported area overheads by at least 50 % in all cases. Thereby, for registers with 64 or more bits, fault tolerance based on the presented error correction utilizing a logarithmic checksum becomes even more favorable in terms of hardware overhead than sole detection achieved through the bitwise implementation of Dual Modular Redundancy.
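The logarithmic growth behind this comparison can be made concrete by counting redundant storage bits. The sketch below is a rough illustration only: it assumes that the characteristic addresses the positions 1 to n, so that it needs ceil(log2(n + 1)) bits, and it ignores the Exclusive OR trees, comparators and voters. It therefore does not reproduce the reported area percentages, and the function names are hypothetical.

```python
def check_bits(n):
    """Bits needed to address positions 1..n, i.e. ceil(log2(n + 1))."""
    return n.bit_length()

def redundant_bits(n):
    """Redundant storage bits per n-bit register (storage only, no logic)."""
    return {
        "characteristic": check_bits(n),
        "duplication": n,        # bitwise Dual Modular Redundancy
        "triplication": 2 * n,   # bitwise Triple Modular Redundancy
    }

for n in (7, 15, 31, 63, 127, 255):
    r = redundant_bits(n)
    print(n, r["characteristic"], r["duplication"], r["triplication"])
```

The number of characteristic bits grows by one each time the register size doubles, whereas bitwise redundancy grows linearly, which is why the relative overhead of the code-based architectures shrinks for larger registers.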

Part III

Infrastructure Reuse for Offline Testing

Chapter 8

Test Access through Infrastructure Reuse

The emerging need for fault tolerance was targeted in the last part by the introduction of a self-contained infrastructure able to correct Single Bit Upsets in the sequential circuit state. Typically, a design is augmented by different types of infrastructure serving orthogonal objectives. The most widespread use of on-chip infrastructure is test, an experiment to show the presence of hard faults, which is performed at least once for every produced chip during manufacturing test. Testing a circuit involves the abstraction from defects to faults within a fault model and the generation of a circuit-specific test set that covers as many of the possible fault locations as possible.

As faults located at internal nodes may exhibit a low accessibility, additional infrastructure is usually employed that serves as a Test Access Mechanism (TAM) in order to increase testability and reduce test application time at the cost of additional area. The traditional use of two distinct infrastructures, one to counter soft errors during operation and one to provide test access, increases the chip area that has to be allotted to infrastructure. Moreover, the two infrastructures are never used concurrently because fault tolerance and test pursue contradicting goals: fault tolerance mitigates the effect of soft errors, whereas test shows the presence of hard faults.

This chapter is based on the Bit-Flipping Scan architecture from [IW14], a unified infrastructure with low hardware overhead that procures fault tolerance by online correction and supports offline test by serving as an efficient test access mechanism.

The remainder of this chapter starts by describing the unified architecture and its extensions beyond the online fault tolerance from the previous part. Subsequently, test application under infrastructure reuse is detailed along with the modes of operation that provide test access to the sequential circuit state in Sections 8.2 to 8.4. Finally, the achievable test access efficiency is discussed theoretically before concluding with a short summary.