4 Automated Functional Approximation of Sequential Circuits

4.1 Datapath Separation

The first prerequisite for approximation in the presented methodology is the identification of elements that qualify for approximation. The previous chapter has already explained how imprecisions in some elements affect only the quality of a circuit's result, while imprecisions in others also affect the control flow, i.e. the function of the circuit. Hence, faults in the latter can render the operation completely useless.

This does not only result in very high error rates at the output pins, which is usually unwanted even in Approximate Computing. What is even worse for Approximate Computing is that the behavior is usually not reproducible. The measured error rates differ from run to run, even if the same set of test patterns is used and the same error probabilities are assigned to the flip-flops. The reason is that the propagation of faults in these registers depends largely on the state of the circuit and on its state transitions. Clearly, registers that behave like this are part of the control path of a circuit. In most cases they are part of the state machines that control the operation of the circuit.

This is why, in the following, this initial prerequisite will be referred to as the separation of data and control path. In most cases, approximations can only be tolerated in those elements of a circuit that affect only the data path. Approximations, and hence faults, in the control path usually result in very high error rates because the control flow is disturbed, which is why control path elements are usually not qualified for approximation. Additionally, as the propagation of faults in the control path depends on the actual state of the circuit, it sometimes takes more time to observe the faults and sometimes less. This would make it very difficult to generate meaningful results and to obtain a reproducible fault behavior during the analysis of the circuit. Figure 4.2 visualizes this problem. In this example, faults have been injected into a hardware-based “spacewire” implementation. Spacewire is a communications standard for exchanging data in space applications [121]. Even without knowing the details of the application, it is clear that it contains a control path that is, for instance, responsible for the framing of data. The data path, in contrast, likely consists only of buffers that temporarily store the data received and to be transmitted.

Figure 4.2: Output error probability and variance of a “spacewire” implementation when injecting errors equally with a probability of 0.0001 into both data and control path [143]

In the experiment, errors have been injected equally into each flip-flop with an error probability of 0.0001 [137]. The emulation length has been set to 1,000,000 clock cycles. This trial has been repeated 50 times. The figures show the mean of the measured error probability at each output pin and its variance. The right figure shows the measurements made when injecting faults into every register, i.e. into both data and control path. The left figure shows the results when injecting faults only into the data path. First of all, one can see that the error probabilities are much higher when faults are also injected into the control path.

Note that the scale of the y-axis differs. Furthermore, one can see that when injecting faults into the data path only, faults are visible only at the output pins with index 4-11. The error probability at the other pins is 0.0. Note that the error probability in the right figure for pins 12-24 is not zero but ∼10⁻⁵. The 8 output pins that experience a nonzero error probability are the actual data pins, which output the received payload. Furthermore, one can see that the variance of the determined error rate is much lower when faults are injected into the data path only, by a factor of about 2·10⁴. This suggests that allowing approximations only in the data path of a circuit is the key to creating a trustworthy and direct relationship between the degree of approximation and the effects on the outputs. When moving the focus from Approximate Computing to classical reliability analysis, one can infer that the control path of a circuit requires special protection in order to be robust against soft errors caused by radiation.

The difficulty, however, is to identify which parts of a circuit belong to the data path and which to the control path. If the circuit is small and its functionality is clear, the identification can be done manually. For most applications, however, this identification has to be automated.
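The evaluation behind Figure 4.2 boils down to repeating a fault-injection run several times and computing, for every output pin, the mean and the variance of the measured error rate. The following Python sketch illustrates only this bookkeeping. The function `emulate_once` is a hypothetical toy stand-in (one independently faulted flip-flop per output pin) and not the FPGA-based emulator, the names `emulate_once` and `summarize` are assumptions made for illustration, and the run length is shortened so that the example finishes quickly.

```python
import random
from statistics import mean, variance


def emulate_once(p_fault, num_cycles, num_pins, rng):
    """Toy stand-in for one emulation run (NOT the FPGA emulator).

    Assumption: every output pin is driven by exactly one flip-flop whose
    value is flipped with probability p_fault in each clock cycle.
    Returns the observed error rate per output pin.
    """
    rates = []
    for _ in range(num_pins):
        errors = sum(1 for _ in range(num_cycles) if rng.random() < p_fault)
        rates.append(errors / num_cycles)
    return rates


def summarize(runs):
    """Return the per-pin mean and sample variance over repeated runs."""
    per_pin = list(zip(*runs))  # transpose: one tuple of error rates per pin
    means = [mean(pin) for pin in per_pin]
    variances = [variance(pin) for pin in per_pin]
    return means, variances


rng = random.Random(0)
# 50 repetitions as in the experiment; the cycle count is reduced here.
runs = [emulate_once(p_fault=1e-4, num_cycles=10_000, num_pins=8, rng=rng)
        for _ in range(50)]
means, variances = summarize(runs)
for pin, (m, v) in enumerate(zip(means, variances)):
    print(f"pin {pin}: mean={m:.2e} variance={v:.2e}")
```

A stateless toy model like this cannot reproduce the state-dependent fault propagation of a control path; it only shows how the per-pin statistics are obtained from repeated runs.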

Two methods have been developed and evaluated that allow an automated identification of the elements belonging to the data path and to the control path of a circuit, respectively. Both methods require that the circuit has input and output pins that handle data as well as pins that control the circuit. A generic diagram of the separation of a circuit into control and data path is shown in Figure 4.3. To identify the members of the data path and of the control path, the corresponding input and output pins are identified first. Put simply, the pins where faults can be tolerated correspond to the data path. These pins usually carry data. Even in Approximate Computing, it is unlikely that faults can be tolerated at control pins.

For example, the H.264 video decoder introduced in the previous chapter writes the decoded video data to an external memory. Its interface therefore consists of data pins carrying the video data and several control pins, such as address pins and “read/write enable” pins, that control the data output. In this example, it is clear that the video output pins belong to the data path and the control pins to the control path. Clearly, this approach only works for circuits that not only output data but also have a control interface. For circuits where this separation is not clear, i.e. where the control path cannot be identified, an additional step is mandatory prior to approximation; it is presented in Section 4.1.3.

Figure 4.3: Visualization of control and data flow of a generic circuit

4.1.1 Netlist-based Separation Approach

The first approach is based on analyzing the netlist of a circuit. Once the input and output pins of the circuit have been categorized, the idea is to follow the paths of these pins through the circuit. By doing so, one can identify which elements are connected to which output pins. This can be done by building an abstract syntax tree of the circuit, as has already been done for the circuit instrumentation. Based on that tree, one can then identify all nodes that influence only data outputs and those that also influence control outputs. This approach is visualized in Figure 4.4. One can already see that this separation technique is very strict: a single connection to a control output is sufficient for an element to be categorized as a control path element. It can happen, however, that the influence of such an element on the control flow of the circuit is minimal, even though the element is technically part of it. The separation based on the second technique, presented in the next section, is therefore less strict. Nevertheless, the netlist-based method clearly gives the most precise and sharp separation of control and data path.

Figure 4.4: Data and control path separation based on analyzing the netlist of a circuit

The netlist-based separation can also be performed without access to a netlist parser. By exploiting a simple trick, the synthesis and optimization capabilities of a regular synthesis tool can be used to separate the control path from the data path. If the input and output pins of the data path are left unconnected, the synthesis tool usually removes all related components during circuit optimization. The resulting netlist then comprises only control path related elements. By comparing the optimized netlist with the unoptimized one, the data path can be extracted. Clearly, this method is not very reliable, as no details are known about the optimization mechanisms of commercial synthesis tools. However, it turned out to work well in most cases.

Regardless of how the netlist is separated, this method has a major advantage over the one presented in the next section: by design, the netlist-based approach does not produce any false positives. A false positive in this case means that a register is detected as part of the data path even though it is a control path related register. A false positive could mean that the register is approximated, i.e. that faults are tolerated in it, even though these faults can propagate to the control structures of the circuit. It is, however, also possible that the influence of such a register on the control flow is only minimal and could hence be tolerated. With the netlist-based method, the potential savings of approximating this register are then lost, because any element that has a connection to a control-related output pin is detected as a control path element. It is still possible that registers are classified as data path even though they are technically part of the control path. This is only possible if such a register has no further connection to the control path, which in turn means that faults in it can generally be tolerated. Similarly, if the control path is data dependent, as depicted in Figure 4.4, any register that has at least one connection to the control path is classified as a control path register.

The netlist-based approach is hence a very conservative separation, as every element that has at least one connection to a control path related output pin is classified as a control path element, even if its influence is very small or even ruled out by masking effects. The netlist-based approach therefore serves as a baseline for the emulation-based approach presented next, which is less strict and tries to allow approximations even in the control path if their effects are negligible.
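The strict classification rule of the netlist-based approach can be illustrated by a backward traversal over a connectivity graph. The following Python sketch assumes that the netlist is already available as a dictionary mapping every node to the nodes that drive it; this representation, the function names and the small example circuit are hypothetical and are not taken from the tool flow described here. The helper `data_path_from_synthesis` mirrors the synthesis-based shortcut as a plain set difference and additionally assumes that cell names are identical in both netlists, which a real synthesis tool does not guarantee.

```python
from collections import deque


def fan_in_cone(drivers, start_pins):
    """Return all nodes from which at least one of start_pins is reachable."""
    seen = set()
    queue = deque(start_pins)
    while queue:
        node = queue.popleft()
        for src in drivers.get(node, ()):
            if src not in seen:
                seen.add(src)
                queue.append(src)
    return seen


def separate(drivers, registers, data_outputs, control_outputs):
    """Strict rule: one connection to a control output makes a register control path."""
    control_cone = fan_in_cone(drivers, control_outputs)
    data_cone = fan_in_cone(drivers, data_outputs)
    control_regs = {r for r in registers if r in control_cone}
    data_regs = {r for r in registers if r in data_cone and r not in control_cone}
    return data_regs, control_regs


def data_path_from_synthesis(full_cells, control_only_cells):
    """Synthesis shortcut: cells removed by the optimization form the data path.

    Assumption: cell names survive synthesis unchanged in both netlists.
    """
    return set(full_cells) - set(control_only_cells)


# Hypothetical example circuit: node -> list of driving nodes.
drivers = {
    "data_out": ["buf1"],
    "buf1": ["buf0"],
    "buf0": ["data_in"],
    "hdr": ["buf0"],                      # header register taken from the data stream
    "state": ["state", "ctrl_in", "hdr"], # state machine depends on the header
    "wr_en": ["state"],
}
registers = {"buf0", "buf1", "hdr", "state"}

data_regs, control_regs = separate(drivers, registers, ["data_out"], ["wr_en"])
print("data path:   ", sorted(data_regs))
print("control path:", sorted(control_regs))
print("via synthesis diff:",
      sorted(data_path_from_synthesis(registers, {"buf0", "hdr", "state"})))
```

In this example, `buf0` feeds both the data output and the header register that steers the state machine, so it ends up in the control path, which illustrates how strict the rule is.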

4.1.2 Emulative Separation Approach

The emulative approach to detecting the data and control path related elements of a circuit uses the FPGA-based emulator itself. The initial step is the same as for the netlist-based approach: the output pins are classified into those where faults can be tolerated and those where they cannot. Note that at this early stage of approximation it is not required to distinguish between the error rates that can be tolerated. At this stage it is only of interest which elements can tolerate faults in general.
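The measurement on which this approach is built injects faults into exactly one register at a time and records the error rates observed at the outputs, which yields one matrix row per register. A minimal Python sketch of this loop is given here. It is not a reproduction of Algorithm 1: the function `build_prm`, its parameters and the three-register toy circuit in `toy_emulate` are assumptions made for illustration, and in the actual flow the emulation step is carried out by the FPGA-based emulator.

```python
import random


def toy_emulate(fault_probs, num_cycles):
    """Toy stand-in for the emulator (hypothetical circuit, NOT the FPGA flow).

    Three registers are reloaded with random bits every cycle.
    Output 0 is r0 XOR r1, output 1 is r2. A fault flips a register bit.
    Returns the error rate observed at each of the two outputs.
    """
    rng = random.Random(0)  # fixed seed: same stimulus for every injection
    errors = [0, 0]
    for _ in range(num_cycles):
        golden = [rng.getrandbits(1) for _ in range(3)]
        faulty = [bit ^ (rng.random() < p) for bit, p in zip(golden, fault_probs)]
        if (golden[0] ^ golden[1]) != (faulty[0] ^ faulty[1]):
            errors[0] += 1
        if golden[2] != faulty[2]:
            errors[1] += 1
    return [e / num_cycles for e in errors]


def build_prm(emulate, num_inj, num_cycles, p_inject=0.5):
    """Build a num_inj x num_out matrix with one row per injection spot."""
    prm = []
    for reg in range(num_inj):
        fault_probs = [0.0] * num_inj   # all other registers stay fault-free
        fault_probs[reg] = p_inject     # only this register is faulted
        prm.append(emulate(fault_probs, num_cycles))
    return prm


prm = build_prm(toy_emulate, num_inj=3, num_cycles=10_000)
for reg, row in enumerate(prm):
    print(f"register {reg}: " + "  ".join(f"{p:.3f}" for p in row))
```

In the toy circuit, the rows of registers 0 and 1 are nonzero only in the first column and the row of register 2 only in the second, so each row directly shows which outputs a register can disturb.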

In the next step, the error probability of each register is set to 0.5, one register at a time. For each register, the circuit is then emulated for a certain number of clock cycles and the resulting error probability at the output pins is measured. Doing this for every register reveals which register influences which output pin. The result can be used to build a matrix with the dimensions numInj × numOut that relates approximations at the registers to their effects on the output pins. Here, numInj is the number of injection spots, i.e. usually the number of registers, and numOut is the number of outputs. In the following, this matrix is called the “Probability-Relation-Matrix” (PRM). Each row shows the error probability observed at the outputs when the error probability of that specific element has been set to 0.5 and all others to 0.0. It is important to keep in mind that the matrix only shows the relation when one single register is approximated, not multiple registers at the same time. The procedure is summarized in Algorithm 1. The matrix can be visualized graphically, which makes it easier to interpret.

Algorithm 1 Determination of the “Probability-Relation-Matrix”

Figure 4.5 shows a plot of the PRM of an exemplary benchmark circuit, a

From the document: Automated Power Optimization of Sequential Integrated Circuits through Approximate Computing (pages 104-110)