
5 Approximation at Gate Level for Voltage Scaling

5.5 Applied voltage over-scaling

With the methodology presented in the previous sections, it is now possible to scale the supply voltage and estimate the resulting error rates. The previous chapter showed how to determine the maximum error probability that can be tolerated at each register input. This information is the prerequisite for performing the approximation by voltage over-scaling in this chapter. Hence, it is assumed that the maximum tolerable error rate for an approximated operating point is known for each register in the circuit. Because the presented scaling approach is analytic, the scaling operation is straightforward: the time needed to estimate the error rates is not critical. The simplest strategy is therefore to reduce the voltage of the whole fan-in of an endpoint, starting from the nominal voltage, by a predefined step δ and to estimate the resulting error rate. If the error rate is below the maximum value, the step is repeated; if not, the voltage is increased by δ again and the scaling stops.
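The stepwise strategy described above can be sketched as follows. This is a minimal illustration and not the TCL implementation used in this work: `estimate_error_rate` is a hypothetical stand-in for the analytic estimator of the previous sections, and the function names, the toy error model and the step size of 0.05 V are assumptions. Only the 1.2 V nominal and 0.7 V minimum voltage are taken from the text.

```python
from typing import Callable


def scale_endpoint(
    estimate_error_rate: Callable[[float], float],
    max_error_rate: float,
    v_nominal: float = 1.2,
    v_min: float = 0.7,
    delta: float = 0.05,
) -> float:
    """Return the lowest fan-in voltage whose estimated error rate is tolerated.

    estimate_error_rate is a hypothetical callback mapping a supply voltage
    to the error probability at one timing endpoint.
    """
    steps = 0  # number of accepted reductions by delta
    while True:
        candidate = round(v_nominal - (steps + 1) * delta, 6)
        if candidate < v_min - 1e-9:
            break  # never go below the assumed minimum voltage
        # Assumption: an error rate equal to the limit still counts as tolerated.
        if estimate_error_rate(candidate) > max_error_rate:
            break  # too many errors: keep the previous voltage ("increase by delta")
        steps += 1
    return round(v_nominal - steps * delta, 6)


def toy_error_model(voltage: float) -> float:
    # Illustrative only: no errors at or above 1.0 V, then a linear rise.
    return max(0.0, (1.0 - voltage) * 0.5)


if __name__ == "__main__":
    print(scale_endpoint(toy_error_model, max_error_rate=0.06))
```

Rejecting the candidate before it is committed is equivalent to lowering the voltage and then raising it again by δ, but avoids a second bookkeeping step.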

This scaling operation has to be performed for all timing endpoints in the circuit that are voltage over-scaling candidates, i.e. that can tolerate an error probability p_e > 0. Once this is done, the “shared components” have to be taken care of, as described in Section 5.3. Afterwards, the minimum voltage for each gate in the circuit is known. This information can then be used to define voltage islands, either statically at synthesis time or dynamically at run-time. All these steps are comparatively simple and were performed in this work with the help of some TCL scripts.
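How per-endpoint results might be merged into a per-gate minimum voltage can be sketched as below. Section 5.3 is not reproduced here, so the merging rule is an assumption made for illustration only: a gate shared by several fan-in cones takes the highest voltage demanded by any endpoint it drives, and endpoints without a scaling result stay at the nominal voltage. All names are hypothetical.

```python
from typing import Dict, Set


def gate_min_voltages(
    fanin_gates: Dict[str, Set[str]],
    endpoint_voltage: Dict[str, float],
    v_nominal: float = 1.2,
) -> Dict[str, float]:
    """Derive a minimum supply voltage per gate from per-endpoint results.

    fanin_gates maps each timing endpoint to the gates in its fan-in cone.
    endpoint_voltage holds the scaled voltage found for each candidate endpoint.
    """
    gate_voltage: Dict[str, float] = {}
    for endpoint, gates in fanin_gates.items():
        # Endpoints that are not over-scaling candidates keep the nominal voltage.
        v_required = endpoint_voltage.get(endpoint, v_nominal)
        for gate in gates:
            # Assumed rule: a shared gate must satisfy its most demanding endpoint.
            gate_voltage[gate] = max(gate_voltage.get(gate, 0.0), v_required)
    return gate_voltage


if __name__ == "__main__":
    cones = {"ff_a": {"g1", "g2"}, "ff_b": {"g2", "g3"}, "ff_c": {"g3", "g4"}}
    scaled = {"ff_a": 0.8, "ff_b": 1.0}  # ff_c tolerates no errors
    print(gate_min_voltages(cones, scaled))
```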

The proposed methodology has been evaluated with the Sobel filter example application introduced earlier. The application has been approximated as presented in the previous chapter. Three approximate operating points have been defined, resulting in an image quality (PSNR) of 30, 40 or 50 dB. There is practically no visible degradation of the visual quality between the three operating points, as can be seen in Figure 5.6.

Figure 5.6: Difference of the visual quality of a Sobel filtered still when applying no approximation and an approximation for target qualities 30, 40 and 50 dB (panels: no approx., 30 dB, 40 dB, 50 dB)

The circuit that is to be approximated in this example is the multiplication unit of the floating-point unit that the Sobel filter utilizes. For each of the three operating points, a list has been generated stating the maximum error probability that can be tolerated at each register of the FPU (fmul). The circuit consists of 2030 flip-flops, i.e. timing endpoints. Each of these endpoints has been scaled individually, using the “simplified” methodology presented in this chapter. No parallelization has been applied. The scaling analysis took about one hour on a regular desktop PC. The functionality of the circuit for the three operating points has been verified using timing-annotated simulations, as presented earlier. Figure 5.7 shows the estimated power consumption of the FPU for the three operating points. The power consumption has been estimated using Synopsys PrimeTime with accurate test patterns.

Figure 5.7: Estimated power consumption of a floating point unit for different approximated operating points of a Sobel filter when applying voltage over-scaling [145] (operating points: 30, 40, 50, no approximation)

One can see that, compared to applying no approximation at all, up to 27.9 % of the power consumption could be saved when running the used benchmark. This is an impressive result, even though the saving is somewhat smaller than when applying “circuit pruning”. This is not surprising, as circuit pruning tackles the static as well as the dynamic power consumption. Furthermore, 0.7 V has been assumed to be the minimum voltage, as this is safely above the threshold voltage of the technology, even though paths could often have been operated even slower, i.e. at a still lower voltage. Surprisingly, the lowest power consumption was observed for the operating point with the best quality. It was already apparent when approximating the circuit in the previous chapter that the operating points are very close together. The difference between them is not large enough to produce a significant difference in power consumption. The fact that the minimum power consumption occurs at 50 dB rather than at 30 dB is likely caused by inaccuracies, mainly during the coarse and the fine approximation. Nevertheless, the applicability of voltage over-scaling has been demonstrated: the reduction of the power consumption compared to applying no approximation at all is clearly visible.

As elaborated in Section 5.4, it is unlikely that many different voltage domains will be implemented on a chip, due to the resulting overhead and complexity. The measurements have been made when allowing 2 supply voltages, 51 voltage domains and 501 voltage domains, as shown in Figure 5.7. For instance, “VOS [0.7V 1.2V]” corresponds to voltage over-scaling (VOS) with the two voltage domains 0.7 V and 1.2 V. In contrast, “VOS 0.7V:0.1V:1.2V” corresponds to the voltage islands 0.7 V, 0.8 V, 0.9 V, 1.0 V, 1.1 V and 1.2 V, i.e. steps of 0.1 V. The minimum voltage has been set to 0.7 V and the nominal voltage of the technology is 1.2 V. One can see that the difference in the potential power savings is comparatively small for different numbers of voltage domains. Clearly, the largest power savings are possible if almost no restrictions are placed on the number of voltage domains. In this case almost no supply voltage has to be rounded up (unnecessarily) to the next available supply voltage. However, even with only two supply voltages, the power savings are significant, and the difference to operating 501 voltage domains is comparatively small. This is a very promising result, as it shows that even a very small number of voltage domains can be sufficient to successfully apply circuit approximation through voltage over-scaling. It even seems to make a dynamic assignment of gates to voltage domains at run-time feasible, as the routing overhead can be limited.
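The effect of restricting the number of voltage domains can be illustrated with a small sketch that rounds each gate's minimum voltage up to the next available supply level. This is a hedged illustration, not the flow used for the measurements: the function names and the example gate voltages are hypothetical. The counts of 51 and 501 domains would be consistent with steps of 0.01 V and 0.001 V between 0.7 V and 1.2 V, but the text does not state these step sizes, so that reading is an inference.

```python
import bisect
from typing import Dict, List


def voltage_levels(v_min: float, step: float, v_max: float) -> List[float]:
    """Enumerate supply levels in the style of the 'v_min:step:v_max' notation."""
    count = int(round((v_max - v_min) / step)) + 1
    return [round(v_min + i * step, 6) for i in range(count)]


def assign_domains(gate_voltage: Dict[str, float], levels: List[float]) -> Dict[str, float]:
    """Round every gate's minimum voltage up to the next available level."""
    ordered = sorted(levels)
    assigned: Dict[str, float] = {}
    for gate, v_required in gate_voltage.items():
        # Small tolerance so a requirement equal to a level is not pushed up.
        idx = bisect.bisect_left(ordered, v_required - 1e-9)
        if idx == len(ordered):
            raise ValueError(f"{gate} needs {v_required} V, above the highest level")
        assigned[gate] = ordered[idx]
    return assigned


def mean_round_up(gate_voltage: Dict[str, float], assigned: Dict[str, float]) -> float:
    """Average voltage added purely because of the limited set of levels."""
    return sum(assigned[g] - gate_voltage[g] for g in gate_voltage) / len(gate_voltage)


if __name__ == "__main__":
    gates = {"g1": 0.83, "g2": 0.70, "g3": 1.04}  # hypothetical per-gate minima
    for levels in ([0.7, 1.2], voltage_levels(0.7, 0.1, 1.2)):
        assigned = assign_domains(gates, levels)
        print(len(levels), assigned, mean_round_up(gates, assigned))
```

The average round-up is only a rough proxy for lost savings. The power figures reported above come from PrimeTime, not from a model of this kind.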

5.6 Summary

In this chapter, a methodology has been presented that allows the supply voltage required by each gate within a circuit to be estimated analytically in order to approach an approximated operating point. The methodology is generically applicable to any sequential ASIC circuit. It is based on the commercial timing analysis software PrimeTime by Synopsys. PrimeTime is used to estimate which timing path fails at which supply voltage. Using the “Composite Current Source” model, PrimeTime is able to perform a timing analysis for any applied supply voltage, even outside of the specifications. Based on this information, the resulting error probabilities are estimated for each timing endpoint using a methodology developed in this work. The methodology essentially estimates the likelihood of a signal edge not being propagated to an endpoint due to masking effects.

Two variants have been presented: one providing a more accurate estimation, and the other a rougher estimation that considerably simplifies the computations. Experimental results have shown that the estimation of error rates works well in general, but has some limitations, as not all effects are covered. However, the presented approach offers a simple and very fast way to estimate error rates due to voltage over-scaling. It has then been shown how this methodology can be used to practically apply voltage over-scaling to approximate an exemplary circuit. The methodology made it possible to approach the desired approximate operating points very closely and to match the desired error rates. It could be shown that even very few voltage domains can be sufficient to gain a significant reduction of the power consumption. The results of the presented methodology are very promising. However, the approach is still at a very early stage of development, and there are many points at which it could be improved. Nevertheless, this work has presented the first automated analytic approach that can be generically applied. It therefore proves the practical feasibility of voltage over-scaling as an approximation technique for a wide range of applications.

From the document: Automated Power Optimization of Sequential Integrated Circuits through Approximate Computing (pages 155-159)