Failure Point 3 Experiment: Oversampling RNG

6.2 Random Number Generator Failure Experiments

6.2.7 Failure Point 3 Experiment: Oversampling RNG

The previous experiments examined possible failure points in the random number source or from outside interference. They cover only two of the three possible failure points mentioned in the simulator introduction (see Section 6.1). This last experiment examines the tests' sensitivity to a defective digitizer.

The experiment is broken into two parts with the first investigating extreme oversampling where every bit is repeated. The second section investigates the effect of a whole 24-bit word being repeated. Figure 6.40 shows an example of both oversampling failures.

Bit oversampling: 101110 → 11 00 11 11 11 00

Word oversampling: 101 001 111 → 101 101 001 001 111 111

Figure 6.40: Bit and word oversampling error example.

Oversampling RNG Implementation

Both the bit-repeating and the word-repeating RNG are modifications of the Matlab™ binary random number generator. They are implemented by storing either one bit or a full binary word from the Matlab™ generator in a temporary variable. The data in the variable is then written twice during the assembly of the 100000-bit sequence. The placeholder counter is advanced after each bit or word is stored to prepare it for the next input. This process is repeated until all 100000 bits have been created. The RNG is then reinitialized, and the full process is run 500 times to create the full set of sample sequences.
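The procedure described above can be sketched in a few lines. This is a minimal Python sketch, not the original Matlab™ code: Python's `random.Random` stands in for the Matlab™ generator, reseeding for each sequence is one assumed reading of "reinitialized", and all function names are hypothetical.

```python
import random


def oversample_bits(source_bits):
    """Store every source bit twice: 101110 -> 110011111100."""
    out = []
    for bit in source_bits:
        out.extend((bit, bit))
    return out


def oversample_words(source_bits, word_len=24):
    """Store every word_len-bit word twice, one word after the other."""
    out = []
    for start in range(0, len(source_bits), word_len):
        word = source_bits[start:start + word_len]
        out.extend(word)
        out.extend(word)
    return out


def faulty_sequences(oversample, n_sequences=500, n_bits=100_000, seed=0):
    """Yield n_sequences faulty sequences of n_bits bits each.

    Assumption: the generator is reseeded for every sequence.
    More source bits than needed are drawn, then the result is truncated.
    """
    for index in range(n_sequences):
        rng = random.Random(seed + index)
        source = [rng.getrandbits(1) for _ in range(n_bits)]
        yield oversample(source)[:n_bits]
```

Calling `oversample_bits([1, 0, 1, 1, 1, 0])` reproduces the bit oversampling example of Figure 6.40, and `oversample_words` with `word_len=3` reproduces the word example.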

Results for the Bit Oversampling RNG

Figure 6.41: Single test “pass” count for the bit oversampling generator. (Axes: sequence length, 25 to 100000 bits, and count of “pass” sequences, max. 500. Series: Frequency, Runs, Longest Runs, Poker, Turning Point, Autocorrelation, Frequency Block, Serial, FIPS @ 20000.)

Examining the results from the bit-repeating oversample experiment (see Figures 6.42 to 6.44), it is apparent that the error is quickly identified. The “pass” counting graph (Figure 6.41) shows that the FIPS 140-2 test standard rejects all the sequences and labels the generator as nonrandom. In this graph it is easier to identify the tests that do not recognize an error than in the FIPS matching chart. The three tests that do not recognize the presence of an error are the frequency, turning point and autocorrelation tests. For the frequency test, the full sequence can be represented as two half-sized random sequences. Combining these two sequences still provides the correct number of ones and zeros for the test to pass the sequence.
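The frequency test's blindness to bit doubling can be checked directly. The sketch below is illustrative only: it uses Python's `random` module as a stand-in source and hypothetical variable names, and it compares the proportion of ones rather than running the actual frequency test.

```python
import random

rng = random.Random(1)  # stand-in source, not the thesis generator
source = [rng.getrandbits(1) for _ in range(5_000)]

# Bit oversampling: every source bit is stored twice.
doubled = [b for bit in source for b in (bit, bit)]

# De-interleaving recovers two identical half-sized random sequences.
even_half = doubled[0::2]
odd_half = doubled[1::2]

# The proportion of ones is the same before and after doubling.
ones_fraction_source = sum(source) / len(source)
ones_fraction_doubled = sum(doubled) / len(doubled)
```

Because the count of ones is exactly doubled along with the length, a test that looks only at the balance of ones and zeros sees nothing unusual.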

The implementation of the autocorrelation test compares each bit with the bit delayed by four time units. In this case, the correlation lies between neighbouring bits, not between bits four positions apart. A more complex autocorrelation test that examines the correlation at every delay up to a given value would be more powerful, but would increase the hardware requirements.
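The effect of the chosen delay can be illustrated with a simple agreement count. This is a sketch under stated assumptions: the `agreement` function is a hypothetical stand-in for the autocorrelation statistic (the thesis implementation is not reproduced here), and Python's `random` module replaces the original source.

```python
import random


def agreement(bits, lag):
    """Fraction of positions i where bits[i] equals bits[i + lag]."""
    n = len(bits) - lag
    return sum(bits[i] == bits[i + lag] for i in range(n)) / n


rng = random.Random(2)  # stand-in source
source = [rng.getrandbits(1) for _ in range(50_000)]
doubled = [b for bit in source for b in (bit, bit)]

lag1 = agreement(doubled, 1)  # about 0.75: every second pair is a forced repeat
lag4 = agreement(doubled, 4)  # about 0.50: bits four apart stem from independent source bits
```

At a delay of one, half of the compared pairs are copies of the same source bit, so the agreement rises well above one half. At a delay of four, the compared bits always come from different source bits and the statistic looks like that of a random sequence.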

The turning point test also does not catch the error in this experiment. The results indicate that doubling the bits does not change the number of peaks and troughs enough to indicate an error.

A closer examination of Figure 6.42 shows that the poker test is the most sensitive to the oversampling error, perfectly matching the FIPS standard at a sample sequence length of 75 bits. The runs and serial tests are both close behind, with perfect FIPS matching occurring at sequence lengths of 100 and 250 bits respectively.

Four tests have perfect FIPS matching: the poker, runs, serial and frequency block tests. This is also the first experiment in which the longest runs test indicates an error with any degree of sensitivity, even though it does not reach 100% matching in the sequence lengths selected for the experiment.

Figure 6.42: Single test percent matching with FIPS 140-2 results for the bit oversampling generator. (Axes: sequence length, 25 to 100000 bits, and percent matching, 0 to 100. Series: Frequency, Runs, Longest Runs, Poker, Turning Point, Autocorrelation, Frequency Block, Serial.)

The poker test is an ideal test for finding this error, since the doubling of the bits means some patterns happen more often than others. The example in Figure 6.40 shows how the beginning “10” combination becomes the four-bit “11 00” combination. In this situation the patterns “0000”, “0011”, “1100” and “1111” occur exclusively. Patterns containing a mixed pair, i.e. “01” or “10”, do not occur at all.
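The restriction to four patterns can be made concrete with a short sketch. It assumes the poker statistic in the form used by FIPS 140-2, X = (16/k) Σ f_i² − k over k non-overlapping 4-bit blocks (k = 5000 for 20000 bits, where X must stay below 46.17). The function names are hypothetical and Python's `random` module stands in for the source.

```python
import random
from collections import Counter


def nibble_counts(bits):
    """Count the non-overlapping 4-bit patterns in a bit list."""
    usable = len(bits) - len(bits) % 4
    return Counter(
        "".join(map(str, bits[i:i + 4])) for i in range(0, usable, 4)
    )


def poker_statistic(bits):
    """X = (16 / k) * sum(f_i ** 2) - k for k four-bit blocks."""
    counts = nibble_counts(bits)
    k = sum(counts.values())
    return 16 / k * sum(c * c for c in counts.values()) - k


rng = random.Random(3)  # stand-in source
source = [rng.getrandbits(1) for _ in range(10_000)]
doubled = [b for bit in source for b in (bit, bit)]  # 20000 bits

patterns_seen = set(nibble_counts(doubled))
x_doubled = poker_statistic(doubled)
x_source = poker_statistic(source)
```

With only four of the sixteen patterns present, the statistic for the doubled sequence is at least 15000 for 5000 blocks, far outside any plausible acceptance range, which is consistent with the poker test detecting this fault at very short lengths.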

The single tests match the FIPS standard very well at very small test bit lengths. The test combinations have been included to see whether the 75-bit level can be lowered to 50 or even 25 bits. The test combination results can be seen in Figures 6.43 and 6.44. The first point of interest is that no test combination reaches 100% FIPS matching at a sample sequence length of 50 bits. However, there are two test groups that provide better coverage at the 50-bit sample length than is possible with the single poker test. The runs-poker test reaches a 72% FIPS matching, an improvement of approximately 23% in error detection. The runs-serial test has a 64% FIPS matching, which is a 15% improvement over the poker test. This indicates that the runs test finds the generator faulty through different sequences than the poker test. Because the fail results do not overlap, the group combination improves. This improvement is good; however, it does not reach the 100% that is desired by smart card manufacturers.

Results for the 24-bit Word Oversampling RNG

Figure 6.43: Test combination percent matching with FIPS 140-2 results for the bit oversampling generator showing the combinations Frequency/Runs to Longest Runs/Poker. (Axes: sequence length, 25 to 100000 bits, and percent matching, 0 to 100. Series: F_R_FIPS, F_L_FIPS, F_P_FIPS, F_T_FIPS, F_A_FIPS, F_FB_FIPS, F_S_FIPS, R_L_FIPS, R_P_FIPS, R_T_FIPS, R_A_FIPS, R_FB_FIPS, R_S_FIPS, L_P_FIPS.)

Figure 6.44: Test combination percent matching with FIPS 140-2 results for the bit oversampling generator showing the combinations Longest Runs/Turning Point to Frequency Block/Serial. (Axes: sequence length, 25 to 100000 bits, and percent matching, 0 to 100. Series: L_T_FIPS, L_A_FIPS, L_FB_FIPS, L_S_FIPS, P_T_FIPS, P_A_FIPS, P_FB_FIPS, P_S_FIPS, T_A_FIPS, T_FB_FIPS, T_S_FIPS, A_FB_FIPS, A_S_FIPS, FB_S_FIPS.)

Figure 6.45: Single test “pass” count for the word oversampling generator. (Axes: sequence length, 25 to 100000 bits, and count of “pass” sequences, max. 500. Series: Frequency, Runs, Longest Runs, Poker, Turning Point, Autocorrelation, Frequency Block, Serial, FIPS @ 20000.)

The results from the bit oversampling show that the poker, runs and serial tests are very sensitive to this type of error. Word oversampling, however, is another error that may manifest itself, and in comparison to bit oversampling it is far more subtle. The graphical results from the word oversample experiment are shown in Figures 6.45 to 6.48. Figures 6.45 and 6.46 display the results for the single tests. It is important to note that the FIPS test group marks only approximately 440 samples as coming from a faulty source. Figure 6.45 also shows that no single test matches FIPS at 20000 bits. A better picture of this can be seen in Figure 6.46, where it is evident that none of the tests reach perfect FIPS matching. An interesting phenomenon occurs with both the poker and runs tests. They both decrease in quality at the lower sequence bit lengths, and only at 15000 bits do they improve beyond the other single tests. The results from these two graphs do not seem to match. The poker and runs tests both fail more sequences than the other tests and appear to be closer to the FIPS total in Figure 6.45. However, Figure 6.46 shows that this is not the case, and the poker and runs tests have a lower matching than the other tests. The individual data have been investigated, and the results indicate that the poker and runs tests fail more sequences than the other tests, but these are different sequences from the ones failed by the FIPS group at 20000 bits. For example, using a test sequence of 100 bits, sequences 15 and 19 are marked as fail by the poker test, while at 20000 bits the FIPS group marks sequences 13 and 21 as fails. Therefore, even with the higher failure rates, the poker and runs tests have a lower FIPS matching percentage.
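The difference between failing many sequences and matching FIPS can be illustrated with a toy calculation. The sketch assumes that percent matching is the share of sequences on which a test and the FIPS group reach the same verdict. Only the four sequence indices are taken from the text; the total of 25 sequences and the second test are hypothetical.

```python
def percent_matching(test_fails, fips_fails, n_sequences):
    """Share of sequences where the test verdict equals the FIPS verdict."""
    agree = sum(
        (i in test_fails) == (i in fips_fails)
        for i in range(1, n_sequences + 1)
    )
    return 100.0 * agree / n_sequences


n_sequences = 25          # toy total, not the 500 used in the experiment
fips_fails = {13, 21}     # FIPS group at 20000 bits (indices from the text)
poker_fails = {15, 19}    # poker test at 100 bits (indices from the text)
other_fails = {13}        # hypothetical test: fewer fails, but on a FIPS-failed sequence

poker_match = percent_matching(poker_fails, fips_fails, n_sequences)  # 84.0
other_match = percent_matching(other_fails, fips_fails, n_sequences)  # 96.0
```

The poker test here fails as many sequences as FIPS does, yet it disagrees with FIPS on four sequences, while the hypothetical test with a single fail disagrees on only one.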

Since the single tests do not reach perfect matching with the FIPS standard, there is room for the test combinations to provide it. The results are shown in Figures 6.47 and 6.48. Most of the combinations follow the dominant test of the combination. The runs-poker combination, however, displays worse results than the individual tests until the 15000-bit test length, at which point it matches the poker test result. Only at the 20000-bit length does the test combination reach 100% FIPS matching. The reason for the worse results, other than at the 15000 and 20000 test lengths, is the same as that given for the individual tests: the test combinations mark different sequences as “fail” than the FIPS standard does. The perfect matching at 20000 bits shows that the runs and poker tests are the two important tests for this experiment.

Figure 6.46: Single test percent matching with FIPS 140-2 results for the word oversampling generator. (Axes: sequence length, 25 to 100000 bits, and percent matching, 0 to 100. Series: Frequency, Runs, Longest Runs, Poker, Turning Point, Autocorrelation, Frequency Block, Serial.)

Oversampling Conclusion

The oversampling experiment shows two extremes in random generator testing: one failure model that is detected very distinctly and one that is barely detected. For both failure models, the poker and runs tests are the primary tests. The poker test can be set to a test length of 75 bits for the bit oversampling; however, the only reliable test for the word oversampling is the poker-runs combination at 20000 bits.

Figure 6.47: Test combination percent matching with FIPS 140-2 results for the word oversampling generator showing the combinations Frequency/Runs to Longest Runs/Poker. (Axes: sequence length, 25 to 100000 bits, and percent matching, 0 to 100. Series: F_R_FIPS, F_L_FIPS, F_P_FIPS, F_T_FIPS, F_A_FIPS, F_FB_FIPS, F_S_FIPS, R_L_FIPS, R_P_FIPS, R_T_FIPS, R_A_FIPS, R_FB_FIPS, R_S_FIPS, L_P_FIPS.)

Figure 6.48: Test combination percent matching with FIPS 140-2 results for the word oversampling generator showing the combinations Longest Runs/Turning Point to Frequency Block/Serial. (Axes: sequence length, 25 to 100000 bits, and percent matching, 0 to 100. Series: L_T_FIPS, L_A_FIPS, L_FB_FIPS, L_S_FIPS, P_T_FIPS, P_A_FIPS, P_FB_FIPS, P_S_FIPS, T_A_FIPS, T_FB_FIPS, T_S_FIPS, A_FB_FIPS, A_S_FIPS, FB_S_FIPS.)

Chapter 7 Random Number Generator Testing Unit

7.1 Hardware and Software Analysis

The previous two chapters examined the hardware implementation and the quality characteristics of the eight selected tests. In this chapter the two sets of results are used to determine an efficient test unit design with perfect FIPS matching capabilities.

Using the hardware implementation results from Chapter 5 the RNG tests are categorized into two groups: simple and complex tests. Simple tests have low hardware requirements, while complex tests require more area and power. The complex tests usually perform more complex calculations. The division of the RNG tests into the two groups allows for an easy overview showing which tests can be combined with each other and still have low hardware requirements.

Simple tests can be combined with complex tests, since they do not add significantly to the overall test unit requirements. For complex tests, combining two such tests leads to a very large test unit and/or high power consumption. They are best left as single tests or, if required, combined with simple tests. The tests are shown in Table 7.1. The designs have been separated mainly on the power consumption and area requirements with some consideration to the time delay. The cut-off line for the implemented tests is the runs test, which means the poker and serial tests are considered complex with the rest being simple.

Simple           Complex
Frequency        Poker
Runs             Serial
Longest Runs     Frequency Block
Turning Point    Autocorrelation

Table 7.1: Simple and complex tests based on hardware requirement results.

The simple/complex design rule does not allow for a poker-serial test unit combination, since both of these tests are considered complex. When selecting the test or tests for the end unit, the ideal design is a single simple test that covers all the failure models presented in the last chapter. However, combinations of simple tests, a single complex test, or a combination of a simple test and a single complex test are also acceptable.

RNG failure model           Test 1               Test 2                     Test 3                          Detection
ANSI C                      Poker (20000)        Poker Combos (20000)       -                               Hard
Repeating Pattern           Runs-Poker (1000)    Runs (2500)                Poker (2500)                    Detectable
Bias 52%                    Frequency (20000)    Frequency Combos (20000)   Serial (30000)                  Detectable
Bias 54%                    Frequency (10000)    Poker (10000)              Serial (10000)                  Detectable
Frequency Add Narrow 50%    Freq-Poker (20000)   -                          -                               Hard
Frequency Add Narrow 90%    Poker (15000)        Poker Combos (15000)       Runs (30000)                    Detectable
Frequency Add Wide 50%      Poker (10000)        Poker Combos (10000)       Frequency (20000)               Detectable
Frequency Add Wide 90%      Poker (2500)         Serial (2500)              Poker or Serial Combos (2500)   Detectable
Pink Noise                  Poker (1000)         Poker Combos (1000)        Serial (1000)                   Detectable
Oversample bit              Poker (75)           Poker Combos (75)          Runs (250)                      Detectable
Oversample word             Runs-Poker (20000)   -                          -                               Hard

Table 7.2: Top 3 tests for perfect FIPS 140-2 matching. The number in parentheses is the shortest test length in bits at which 100% matching is reached.

Having divided the RNG tests into two hardware categories, the next step is to examine the simulator results to find the best sample length and to determine which tests are to be included in the test unit to provide perfect FIPS 140-2 matching. A table has been compiled with the top three tests for each failure model and the lowest bit testing length that achieves perfect FIPS matching (see Table 7.2).

Examining the perfect FIPS matching table reveals that the poker test is consistently, with one exception, in the top three list, either as a single test or as part of a test combination. It is the test best able to match the FIPS standard. For some of the more subtle errors, it requires a second test to reach full matching. From this result the conclusion is drawn that it is important to include the poker test in the test unit design.

The next step, before deciding on a second test, is selecting a sequence test length and examining the results with just the poker test. The last column in Table 7.2 is a rating of the failure model, judging how difficult it was for the top tests to match the FIPS standard. The models used to examine the RNG tests have been set at different levels of interference or “non-randomness”. For example, an RNG with a ones bias of 52% produces sequences that are statistically close to what a true RNG would produce. The minor deviation from a true RNG only slightly increases the number of ones, meaning most sequences still pass any statistical test because they do not fall outside the given acceptance range. A classification has been given to each of the models, where “hard” denotes error models that the FIPS standard had trouble detecting. This classification is obtained by examining the FIPS test count graph for each of the models and labeling as hard any failure model that does not have a pass count of zero for the FIPS group. A model with a FIPS pass count of zero is said to be “detectable”.
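The labeling rule amounts to a single comparison. The sketch below uses a hypothetical function name. The two example counts follow from statements in Chapter 6: FIPS rejects all 500 bit oversampling sequences, and marks only approximately 440 of the 500 word oversampling sequences as faulty.

```python
def classify_failure_model(fips_pass_count):
    """Return 'detectable' if the FIPS group passes no sequence, else 'hard'."""
    return "detectable" if fips_pass_count == 0 else "hard"


bit_oversample = classify_failure_model(0)           # all 500 sequences rejected
word_oversample = classify_failure_model(500 - 440)  # roughly 60 sequences still pass
```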

The results in Table 7.2 show that three of the models are hard for the FIPS standard to detect: ANSI C, frequency addition of a narrow single frequency at 50%, and the word oversampling. These three models will not be used in determining the final design.

For the remaining models, the sample sequence length ranges from 75 to 20000 bits. A decision needs to be made that provides a compromise between the best statistical coverage and a feasible implementation. It has been stated that a length of 20000 bits is too long for the generation and testing of bits during the initialization phase in smart cards. Therefore, any model that requires a 20000-bit test length cannot be 100% covered by this test unit. This decision means that the 52% bias generator is not fully covered.

The next step down in the bit sequence length is 15000 bits, which covers the frequency addition of a narrow signal at 90%. For this failure a 97% FIPS matching is achieved at 10000 bits, but this is not perfect FIPS matching. Therefore, a choice has to be made for the remaining failure models. If coverage of the narrow signal interference is important, then the test bit length needs to be 15000 bits. However, this error is not very common in practice, with wider signal interference being the norm. A sample length of 10000 bits can be used that would cover the following failure models at 100% FIPS matching:

• Repeating pattern

• Bias 54%

• Frequency addition of a wide signal at 50%

• Frequency addition of a wide signal at 90%

• Pink noise

• Bit oversample

In addition, the single frequency addition at 90% is covered with a 97% FIPS matching accuracy. This length provides the best compromise between testing time and test accuracy.
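The selection of covered failure models can be reproduced from Table 7.2. The sketch below is a hedged illustration: the dictionary holds the shortest length of the top-ranked test for each detectable model as listed in the table (the top-ranked test is not always the poker test alone), and the names `MIN_LENGTH` and `covered_models` are hypothetical.

```python
# Shortest test length (bits) with 100% FIPS matching for the top-ranked
# test of each "detectable" failure model, transcribed from Table 7.2.
MIN_LENGTH = {
    "Repeating pattern": 1000,
    "Bias 52%": 20000,
    "Bias 54%": 10000,
    "Frequency addition narrow 90%": 15000,
    "Frequency addition wide 50%": 10000,
    "Frequency addition wide 90%": 2500,
    "Pink noise": 1000,
    "Bit oversample": 75,
}


def covered_models(test_length):
    """Failure models fully covered at the given test length."""
    return sorted(m for m, n in MIN_LENGTH.items() if n <= test_length)


at_10000 = covered_models(10_000)
at_15000 = covered_models(15_000)
```

At 10000 bits the function returns the six models listed above. Raising the length to 15000 bits adds only the narrow frequency addition at 90%, and the 52% bias model stays uncovered until 20000 bits.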

Test                            Percentage FIPS matching    Improvement
Bit Oversampling                100%                        -
Word Oversampling               86%                         no improvement
Frequency Addition Wide 50%     100%                        -
Frequency Addition Wide 90%     100%                        -
Frequency Addition Narrow 50%   38.8%                       Freq-Poker 50.2%
Frequency Addition Narrow 90%   97%                         -
Bias 52%                        22.6%                       Freq-Poker 82.4%
Bias 54%                        100%                        -
Bias 56%                        100%                        -
Pink noise                      100%                        -
Repeating Pattern               100%                        -
ANSI C                          54.4%                       -

Table 7.3: Poker test results for each faulty generator with a test sequence of 10000 bits.

At this point, the design has a 10000 bit sample sequence length and the poker test. The next step is to see if better coverage can be achieved by adding a second simple test.

Table 7.3 shows the coverage for each of the failure models using only a poker test with a sequence length of 10000 bits. Also included in Table 7.3 is any improvement gained by adding the next best test. For two of the generators, ANSI C and the narrow frequency addition generator with a 90% interference ratio, the fault detection cannot be improved by applying a second test. However, for the 52% biased, word oversampling, and single frequency addition at 50% generators, the results show that the fault detection is improved with the addition of a second test. The 52% biased RNG shows a significant improvement over the poker test alone (from 22.6% to 82.4%), while the failure of the single frequency addition RNG is now caught in at least 50% of the cases. The improvement for the word oversampling RNG is negligible.

The frequency test is a simple test, and combining it with the poker test adds little to the hardware requirements. To confirm this, the poker-frequency test combination has been implemented in hardware. Extra circuit logic is needed for the structures that control both tests and make a final pass/fail judgment, but this is minor compared to the requirements of the tests themselves. The design is set to a maximum of 50 MHz, which gives Synopsys™ extra room for optimization to reduce the hardware requirements. This should keep the hardware requirements close to those of the poker test alone.

7.2 Poker-Frequency Test Unit

The test unit has been laid out as shown in Figure 7.1. The two tests selected are the poker and frequency tests, with the control logic keeping track of the results from both tests. A pass is only allowed when both tests agree that the sequence comes from a random source. The control logic is shown in Figure 7.2. The controller and the tests wait for the start bit to indicate when the test unit should begin. After the first test is finished, it sets the Finished signal high, and the controller then knows to read the result line for that test. In this case the first test read is the frequency test, followed by the poker test. If both tests agree on a pass, then the Unit_Result signal is set high; otherwise it is left low. The Finished signal is also set high to indicate that a result is available on the Unit_Result line. The unit examines a test sequence of 10000 bits.

                                      Poker Test    Test Unit     FIPS Unit
Area (0.25 μm CMOS technology)        524179 μm²    530982 μm²    588707 μm²
Time Delay                            17.28 ns      17.28 ns      17.19 ns
Power Consumption at 20 ns (50 MHz)   5.159 mW      5.541 mW      8.909 mW
  Controller                                        34.2 μW       34.1 μW
  Poker Test                                        4.852 mW      4.743 mW
  Frequency Test                                    0.654 mW      0.645 mW
  Runs Test                                         -             2.797 mW
  Longest Runs Test                                 -             0.690 mW

Table 7.4: Hardware characteristics of the Online RNG Test Unit.

The test unit is implemented in VHDL and, as with the other tests, has been synthesized using Synopsys™ tools. Since current smart cards run at a maximum of 50 MHz and the current poker test design has a maximum of around 50 MHz, the design has also been optimized to function at this speed. This allows for a higher time delay, which means that dynamic power and area savings are possible. The time delay has been examined only to make sure that the unit design is capable of operating at 50 MHz and that the extra functionality does not require a slower operating speed.

The results for the hardware design are shown in Table 7.4. Even with the added structures, the Synopsys™ optimization tools are able to produce a test unit design that is close in size, power consumption, and speed to the original poker test.

Table 7.4 also includes the results for the FIPS test group unit. It is designed in the same way as the test unit, but it additionally includes the runs and longest runs tests. Both units have been run through the same experiments to obtain the area, time delay and power consumption at 50 MHz. The results show that a 10% saving in area and a 38% saving in power consumption are achieved by using the test unit instead of the FIPS test unit.
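The quoted savings follow directly from the figures in Table 7.4. The short sketch below recomputes them. The helper name `saving_percent` is hypothetical; the numbers are transcribed from the table.

```python
def saving_percent(reference, candidate):
    """Relative saving of candidate over reference, in percent."""
    return 100.0 * (reference - candidate) / reference


# Test unit compared with the FIPS unit (values from Table 7.4).
area_saving = saving_percent(588707, 530982)   # areas in square micrometres
power_saving = saving_percent(8.909, 5.541)    # power in mW at 50 MHz

# Cross-check: the per-block power figures add up to the unit totals (mW).
test_unit_total = 0.0342 + 4.852 + 0.654
fips_unit_total = 0.0341 + 4.743 + 0.645 + 2.797 + 0.690
```

The area saving evaluates to about 9.8% and the power saving to about 37.8%, which round to the 10% and 38% stated in the text.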

The results indicate that there exists a design that is acceptable for smart card implementations. Therefore, a test unit can be provided that achieves perfect FIPS matching for the error models previously covered and that does not require the full FIPS group or the full 20000 bit test length. The test unit provides excellent coverage, is small, and has low power consumption.

From the document: Improving Security for Elliptic Curve Implementations on Smart Cards: A Random Number Generator Test Unit (pages 123-149)