Failure Point 1 Experiment: Repeating Pattern Random Number Generator

6.2 Random Number Generator Failure Experiments

6.2.4 Failure Point 1 Experiment: Repeating Pattern Random Number Generator


Figure 6.8: Single test “pass” count for the ANSI C random number generator. (Plot: count of “pass” sequences, max. 500, versus sequence length from 25 to 100000 bits; series: Frequency, Runs, Longest Runs, Poker, Turning Point, Autocorrelation, Frequency Block, Serial, FIPS @ 20000.)


84 CHAPTER 6. EMPIRICAL TEST QUALITY MEASUREMENT

Figure 6.9: Single test “pass” count for the repeating pattern generator. (Plot: count of “pass” sequences, max. 500, versus sequence length from 25 to 100000 bits; series: Frequency, Runs, Longest Runs, Poker, Turning Point, Autocorrelation, Frequency Block, Serial, FIPS @ 20000.)

In this study, only the effects of using a 100 bit initial random source have been examined. This provides a good starting point for determining the sensitivity of each of the tests. Lengthening the initial sequence would shift the detection point upwards to a higher test sequence length, whereas a shorter initial sequence would have the opposite effect and reduce the detection sequence length. The 100 bit initial sequence has been chosen as a good compromise.
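The repeating pattern fault described above can be sketched in a few lines. This is a minimal illustration only: the function name `repeating_pattern_bits` is hypothetical, and Python's pseudo-random generator stands in for the true RNG that supplies the 100 bit initial sequence in the experiment.

```python
import random

def repeating_pattern_bits(length, seed_len=100, rng=None):
    """Return `length` bits made by repeating one random `seed_len`-bit block.

    Assumption: the faulty generator simply replays its initial block from
    the start; the stand-in source here is Python's PRNG, not a true RNG.
    """
    rng = rng or random.Random()
    seed = [rng.getrandbits(1) for _ in range(seed_len)]
    repeats = -(-length // seed_len)  # ceiling division
    # Tile the initial block and truncate to the requested length.
    return (seed * repeats)[:length]

bits = repeating_pattern_bits(250, rng=random.Random(1))
```

With a 100 bit block, a 250 bit test sequence contains two full copies and one half copy of the block, which is why test lengths of 100 bits and below cannot see the fault.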

Results

The results from the repeating pattern RNG experiment can be seen in Figures 6.9 to 6.12. Since the repeating pattern RNG uses a 100 bit sample sequence from the true RNG, the testing effectively starts at the 250 bit length. Test sequence lengths of 100 bits and smaller should achieve the same pass/fail results as were obtained in the true RNG experiment; Figure 6.9 shows that this is the case.

The single test experiment results are divided into five groups: the poker and runs tests in the first group, the serial test in the second group, the frequency and turning point tests in the third group, the autocorrelation and frequency block tests in the fourth group, and the longest runs test in the fifth group. The fifth group is dropped from further study, since it does not recognize any sequence as faulty and does not help to improve the results of the other tests in test combinations.

The group of greatest interest is the first one. Examining Figure 6.9 reveals that the FIPS standard at 20000 bits rejects all the sample sequences. Further study of the “pass” count also shows that both the runs and poker tests fail all the sample sequences. Both of these tests are part of the FIPS group, and either one may be the most sensitive test for this particular failure model. A closer look at the percent matching to FIPS should show that test groupings with either the poker or the runs test have the best results. This is examined in the following paragraphs.

Figure 6.10: Single test percent matching with FIPS 140-2 results for the repeating pattern generator. (Plot: percent matching, 0 to 100, versus sequence length from 25 to 100000 bits; series: Frequency, Runs, Longest Runs, Poker, Turning Point, Autocorrelation, Frequency Block, Serial.)

Another point of interest is the autocorrelation test, since it is the one currently used in production for smart cards. For this fault, the test begins to detect sequences as coming from a nonrandom generator at the 500 bit sample sequence length. The test achieves 40% matching with the FIPS at 10000 bits, where it levels off and does not show any further improvement. This indicates that the test is not well suited to this fault type.

The FIPS standard rejects all the sample sequences; therefore, this experiment is an analysis of how quickly each test or test group rejects the sample sequences, as shown in Figure 6.10. The best group detects a fault in 20% of the sequences at a length of 250 bits. Fault recognition improves as the test length is increased. At a test length of 500 bits there is 75% and 85% matching with the FIPS for the runs and poker tests respectively. The tests almost achieve 100% matching when the test sequences have a length of 1000 bits. However, only at a sequence length of 2500 bits do the tests catch all the samples.

The poker and runs tests both work by counting occurrences, of patterns and of run lengths respectively. A closer look at the most sensitive test, the poker test, reveals that in the initial subsequence certain patterns occur more often than others; however, the pattern counts are still in the acceptable range. Ideally, the number of occurrences should be equal for all possible patterns in a countably infinite sequence. Since the analyzed data is only a finite sample, this does not occur and one type of pattern occurs more often than another. If the acceptable range for the 25 bit test sequence is scaled up in proportion to the test sequence length (for example from 25 to 20000 bits), then the resulting range is larger than the one properly calculated for the significance level at 20000 bits. Therefore, more sequences will be accepted at 25 bits as coming from a random source when in fact they should be rejected. Here the distribution model fails to accurately portray the distribution of a true random source.
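The effect of repetition on the poker test can be illustrated with a small sketch. It assumes the usual chi-square style poker statistic over non-overlapping 4 bit blocks and the fixed FIPS 140-2 bounds for 20000 bits (2.16 < X < 46.17); the function name `poker_statistic` and the aligned tiling of the 100 bit block are assumptions for illustration, not the simulator's code.

```python
import random
from collections import Counter

def poker_statistic(bits, m=4):
    """Poker statistic X = (2^m / k) * sum(f_i^2) - k over k m-bit blocks."""
    k = len(bits) // m  # number of complete, non-overlapping blocks
    counts = Counter(tuple(bits[i * m:(i + 1) * m]) for i in range(k))
    return (2 ** m) * sum(c * c for c in counts.values()) / k - k

rng = random.Random(7)
seed = [rng.getrandbits(1) for _ in range(100)]  # stand-in initial block

x_seed = poker_statistic(seed)         # 25 blocks from the initial block
x_tiled = poker_statistic(seed * 200)  # 5000 blocks, 20000 bits in total

# FIPS 140-2 poker bounds for a 20000 bit sample.
fips_pass = 2.16 < x_tiled < 46.17
```

Because every pattern count is multiplied by the number of repetitions, the statistic of the tiled sequence is 200 times that of the initial block, while the acceptance bounds stay fixed. A small imbalance that is acceptable in the initial 100 bits therefore grows far outside the acceptable range at 20000 bits.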

The poker test is ahead of the runs test in FIPS matching percentage until the 2500 bit test length, where they both achieve 100% matching. The second group, the serial test, gains matching percentage more slowly than the first group. There is a significant increase between the lengths of 500 and 2500 bits; however, after this point the test slows down in catching the faulty sequences.

The serial test is also a pattern matching test, which explains why it is initially good at rejecting the sample sequences. However, the last 5% of the faulty sample sequences cannot be recognized due to the shorter pattern analysis. Even though it counts patterns, the serial test concentrates more on the correlation between neighboring bits. The last 5% of the samples have very little correlation in the initial subsequence, and even with the repetition of the subsequence the test statistic remains in the acceptance range. The result does improve when the test sequence length is increased to 50000 bits and higher, but it still does not reach 100%.
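To make the "neighboring bit" character of the serial test concrete, the following sketch computes the common two-bit serial statistic (overlapping pairs, compared against a chi-square distribution with 2 degrees of freedom). Whether the simulator uses exactly this variant is an assumption, and `serial_statistic` is a hypothetical name.

```python
def serial_statistic(bits):
    """Two-bit serial statistic over overlapping pairs of adjacent bits."""
    n = len(bits)
    n1 = sum(bits)
    n0 = n - n1
    pairs = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for a, b in zip(bits, bits[1:]):
        pairs[(a, b)] += 1
    sum_sq_pairs = sum(c * c for c in pairs.values())
    return 4.0 / (n - 1) * sum_sq_pairs - 2.0 / n * (n0 * n0 + n1 * n1) + 1.0

# A perfectly balanced but strongly correlated sequence: 0101...01
alternating = [0, 1] * 50
x_alt = serial_statistic(alternating)
```

The alternating sequence has exactly as many ones as zeros, so a pure frequency count sees nothing wrong, yet the pairs “00” and “11” never occur and the statistic lies far above the 5% chi-square threshold of about 5.99. Conversely, an initial block with little correlation between adjacent bits keeps the pair counts balanced however often it is repeated.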

The last test to reach 100% within the given sample test lengths is the turning point test. It matches the FIPS standard, but only starting at 50000 bits. This test is still included in the combination test because it counts a different characteristic than the other tests, and it may have caught samples at lower test lengths that the other tests did not catch.

The results from the combination test can be seen in Figures 6.11 and 6.12. The one test combination that shows a good improvement, at least initially, is the runs/poker test group. There is an increase of approximately 12 percentage points at the 250 bit test length, and of approximately 10 percentage points at the 500 bit length. After this point, the percent matching does not differ from that of the single tests, which indicates that starting from the 1000 bit test length the runs and poker tests are catching the same sequences. Therefore, if 100% matching is required, no improvement is achieved by using a combination of tests.

Conclusion

The repeating pattern error is one type of error that may arise from a faulty RNG. To cover this security hole a test or test group needs to be implemented that detects this fault with the smallest sample sequence length.

The standard, the FIPS 140-2 group with a sample sequence of 20000 bits, is able to recognize that all the sample sequences come from a faulty generator. At 20000 bits, two single tests reject the same samples as the FIPS group: the poker and runs tests. These two tests accurately model the FIPS standard at lower testing lengths, but reach their limit at a testing length of 2500 bits. Below that point the accuracy degrades.

Figure 6.11: Test combinations percent matching with FIPS 140-2 results for the repeating pattern generator showing the combinations Frequency/Runs to Longest Runs/Poker. (Plot: percent matching, 0 to 100, versus sequence length from 25 to 100000 bits; series: F_R_FIPS, F_L_FIPS, F_P_FIPS, F_T_FIPS, F_A_FIPS, F_FB_FIPS, F_S_FIPS, R_L_FIPS, R_P_FIPS, R_T_FIPS, R_A_FIPS, R_FB_FIPS, R_S_FIPS, L_P_FIPS.)

Figure 6.12: Test combinations percent matching with FIPS 140-2 results for the repeating pattern generator showing the combinations Longest Runs/Turning Point to Frequency Block/Serial. (Plot: percent matching, 0 to 100, versus sequence length from 25 to 100000 bits; series: L_T_FIPS, L_A_FIPS, L_FB_FIPS, L_S_FIPS, P_T_FIPS, P_A_FIPS, P_FB_FIPS, P_S_FIPS, T_A_FIPS, T_FB_FIPS, T_S_FIPS, A_FB_FIPS, A_S_FIPS, FB_S_FIPS.)


After examining the combination tests where 100% matching is obtained, it can be seen that there is no improvement over the single tests. Therefore, for the repeating pattern failure a single test, either the runs or poker test, with a testing sequence length of 2500 bits is recommended.

6.2.5 Failure Point 1 Experiment: Bias Random Number Generator

Description

Another possible flaw in cryptographic random number generators occurs when the generator loses the characteristic of producing a zero or a one with equal probability. Generators that do not have a probability P(X = 1) = 0.50 are labeled biased. There are a variety of causes for bias, for example a malfunction in the generator hardware, environmental stress, or external influences on the generator by a hacker. An experiment has been included in the simulator to show the sensitivity of the random number generator tests to this type of fault. The experiment uses a biased RNG to create sample sequences with biases of 52%, 54% and 56%. These bias values indicate the probability of the generator producing a one. These example bias probabilities have been chosen to show the sensitivity of each of the tests to this type of failure, and to give an indication of how the tests react to an increasing bias error. The 50% generator (a properly functioning generator) is not covered in this part, since it is the normally working Matlab RNG, which has already been tested with the given RNG tests.

In the normal operation of the Matlab RNG in the other experiments, the generator produces a sequence of bits of a given length. This time, however, the bit outputs need to be influenced, so the generator is set to create a sequence of values between 0 and 99. Each value is compared to the selected bias value (i.e. 52, 54, or 56); if it is less than this limit, a one is produced. If the value is at or above the limit, a zero is output.
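The thresholding scheme described above can be sketched as follows. The experiment itself uses the Matlab RNG; this Python version, including the name `biased_bits`, is only an illustrative stand-in.

```python
import random

def biased_bits(length, bias_percent, rng=None):
    """Return `length` bits where a one appears with probability bias_percent/100.

    Each bit is derived from a uniform integer in 0..99: values below the
    bias limit give a one, all other values give a zero.
    """
    rng = rng or random.Random()
    return [1 if rng.randrange(100) < bias_percent else 0 for _ in range(length)]

sample = biased_bits(100000, 54, rng=random.Random(3))
ones_fraction = sum(sample) / len(sample)
```

With integer values from 0 to 99, exactly 54 of the 100 possible values lie below a limit of 54, so the comparison "less than the limit" yields a one with probability 0.54.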

This generator was used to create 500 samples of 100000 bits for the simulator. As mentioned previously, the biases selected for this experiment were 52%, 54%, and 56%. Each of the sample sequences was tested with the eight RNG tests at sequence lengths from 25 to 100000 bits.

Results

The results for this generator can be seen in Figures 6.13 to 6.18. It is assumed that with a bias a certain number of tests will pick up the faulty generator. A major question is how quickly (at what sequence length) the error can be identified. Figure 6.13 shows that the FIPS standard does not pass any of the biased sample sequences. Therefore, for the other tests to match the FIPS standard, they need to label all the sample sequences as fails. At the 20000 bit point, three tests that are part of the FIPS group (Frequency, Runs and Poker) plus the serial test have rejected all of the sample sequences. Since the FIPS group has failed all the sample sequences, the percent matching for the single tests is effectively a count of how many sequences each test fails at the given sample sequence length.

Figure 6.13: Single test “pass” count for the 54% biased generator. (Plot: count of “pass” sequences, max. 500, versus sequence length from 25 to 100000 bits; series: Frequency, Runs, Longest Runs, Poker, Turning Point, Autocorrelation, Frequency Block, Serial, FIPS @ 20000.)

For the 54% bias generator using the single tests (see Figure 6.14), the frequency and serial tests begin to label the generator as a fail at a sequence length of 500 bits. As mentioned in the Description, a biased generator produces either more ones or more zeros. For a 54% ones bias, a generator will statistically produce 54 ones for every 100 bits. For small sample sequence lengths it is not possible to pick up a 54% biased generator, since its output falls into the acceptable range. For example, a test sequence of 25 bits has 12.5 ones as the 50% point, with 11 to 14 ones being in the acceptable range. If this range is extended proportionally to 20000 bit sequences, the acceptance range becomes 8800 to 11200, which may no longer match the significance level. For the serial test, the increase in the number of ones also increases the likelihood of the sequence pattern “11” occurring, at the expense of the “00” pattern.
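The acceptance range argument above can be checked numerically. The sketch below uses the normal approximation for the number of ones in a fair sequence; the significance level `alpha=0.01` and the function names are assumptions for illustration, since the level used by the simulator is not stated here.

```python
from math import sqrt
from statistics import NormalDist

def ones_acceptance_range(n, alpha=0.01):
    """Two-sided acceptance range for the count of ones in n fair bits."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(n) / 2  # standard deviation of the count is sqrt(n)/2
    return n / 2 - half_width, n / 2 + half_width

def bias_z_score(n, p_one):
    """Expected excess of ones for a biased source, in fair-source standard deviations."""
    return (p_one - 0.5) * n / (sqrt(n) / 2)

for n in (25, 500, 5000, 20000):
    lo, hi = ones_acceptance_range(n)
    print(n, round(lo, 1), round(hi, 1), round(bias_z_score(n, 0.54), 2))
```

The width of a properly calculated range grows with the square root of the sequence length, whereas the expected excess of ones from a bias grows linearly with it. A range scaled proportionally from 25 bits (8800 to 11200 at 20000 bits) is therefore much wider than the range for the same significance level, and a 4% bias that is invisible at 25 bits lies many standard deviations outside the range at several thousand bits.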

The frequency and serial tests are the most sensitive tests for the 54% biased random number generator. They almost reach 100% matching with the FIPS standard at a sequence length of 5000 bits; however, they only fully match at the 10000 bit length mark. The slight change in the FIPS matching percentage between 5000 and 10000 bits indicates that the actual 100% FIPS matching point lies between these two values. An experiment has been run with test sequence lengths between 5000 and 10000 bits to find a more accurate point at which the tests reach 100%. The frequency and serial tests match 100% with the FIPS standard at a test sequence length of 8000 bits. The poker test reaches 100% at 10000 bits (see Table 6.2 for the results between 5000 and 10000 bits).

Figure 6.14: Single test percent matching with FIPS 140-2 results for the 54% biased generator. (Plot: percent matching, 0 to 100, versus sequence length from 25 to 100000 bits; series: Frequency, Runs, Longest Runs, Poker, Turning Point, Autocorrelation, Frequency Block, Serial.)

Test / Bit Length    5000     6000     7000     8000     9000     10000
Frequency            98.2%    99.4%    99.8%    100%     100%     100%
Runs                 56.2%    73.4%    86.6%    92%      97.2%    98.6%
Poker                70.4%    86%      95.6%    98.4%    99.6%    100%
Serial               98%      99.4%    99.6%    100%     100%     100%

Table 6.2: Percent matching to FIPS for the four tests, zoomed in on test sequence lengths between 5000 and 10000 bits.

The next most sensitive group for the 54% bias includes the poker and runs tests. This test group exhibits approximately the same error identification rate as the first group but at one sequence length grouping higher, i.e. a 2500 bit sample length before errors are detected instead of 1000 bits for the first group. It also copies the first group by plateauing around 98% FIPS matching and then reaching 100%. The runs and poker tests have also been analyzed between 5000 and 10000 bits. Table 6.2 shows the poker test reaching 100% matching at the 10000 bit length, whereas the runs test still does not reach it.

The results from the 52% and 56% bias RNG (see Figures 6.15 and 6.16) show that the same trend applies to both a lower and a higher bias. The same grouping of tests is present in all three bias generator results, with the serial and frequency tests being the best group and the poker and runs tests making up the second group. It is assumed that the lower biased generator is harder to detect, hence shifting the detection sequence length upwards. The same thought holds for the 56% biased RNG, which is easier to detect, resulting in the detection sequence length being shifted down. The results shown in Figures 6.15, 6.14 and 6.16 back up this assumption: the detection points for the 52%, 54% and 56% biased RNGs are at sequence lengths of 2500, 500 and 250 bits respectively. The 100% FIPS matching level is also reached sooner for the higher biased generators.

Figure 6.15: Single test percent matching with FIPS 140-2 results for the 52% biased generator. (Plot: percent matching, 0 to 100, versus sequence length from 25 to 100000 bits; series: Frequency, Runs, Longest Runs, Poker, Turning Point, Autocorrelation, Frequency Block, Serial.)

Up to this point each of the tests has been investigated separately; however, if the tests do not match 100% with the FIPS standard, then there is room for improvement by combining the results of two or more tests. The results of the two test combinations are shown in Figures 6.17 and 6.18. Combinations of three tests have also been performed but are not included in this thesis, since little change was seen in the results between two and three test combinations.

Most of the test combinations in Figures 6.17 and 6.18 do not show any improvement over the single tests due to test masking. This happens when the more sensitive test fails not only all the same sequences as the second test but also a few more. As a result, the points on the chart cluster around the single test results. There is, however, one test group that does show an improvement for some bit lengths: the Frequency-Serial test group. There is a 2% improvement in percent matching to the FIPS at the 1000 and 2500 bit lengths. This improvement does not continue beyond this point, and the Frequency-Serial test group again matches the results of the single Frequency and Serial tests.
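Test masking can be illustrated with a toy example. The verdict lists below are invented for illustration and are not data from the experiment; the combination rule (a pair rejects a sequence when at least one of its tests rejects it) and the function names are assumptions consistent with the description above.

```python
def percent_matching(test_rejects, fips_rejects):
    """Percentage of sequences on which a test agrees with the FIPS verdict."""
    matches = sum(t == f for t, f in zip(test_rejects, fips_rejects))
    return 100.0 * matches / len(fips_rejects)

def combine(rejects_a, rejects_b):
    """Assumed combination rule: reject when at least one test rejects."""
    return [a or b for a, b in zip(rejects_a, rejects_b)]

# Hypothetical verdicts for ten faulty sequences (True = rejected).
fips      = [True] * 10
frequency = [True, True, True, True, True, True, True, True, False, False]
poker     = [True, True, True, True, True, False, False, False, False, False]
serial    = [True, True, True, True, True, True, True, False, True, False]

single_frequency = percent_matching(frequency, fips)
masked_pair = percent_matching(combine(frequency, poker), fips)
improved_pair = percent_matching(combine(frequency, serial), fips)
```

In this toy data the poker test rejects only sequences that the frequency test already rejects, so the Frequency/Poker pair is masked and scores the same as the frequency test alone. The serial test catches one sequence that the frequency test misses, so only the Frequency/Serial pair improves on its members.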

Figure 6.16: Single test percent matching with FIPS 140-2 results for the 56% biased generator. (Plot: percent matching, 0 to 100, versus sequence length from 25 to 100000 bits; series: Frequency, Runs, Longest Runs, Poker, Turning Point, Autocorrelation, Frequency Block, Serial.)

Figure 6.17: Test combinations percent matching with FIPS 140-2 results for the 54% biased generator showing the combinations Frequency/Runs to Longest Runs/Poker. (Plot: percent matching, 0 to 100, versus sequence length from 25 to 100000 bits; series: F_R_FIPS, F_L_FIPS, F_P_FIPS, F_T_FIPS, F_A_FIPS, F_FB_FIPS, F_S_FIPS, R_L_FIPS, R_P_FIPS, R_T_FIPS, R_A_FIPS, R_FB_FIPS, R_S_FIPS, L_P_FIPS.)

Figure 6.18: Test combinations percent matching with FIPS 140-2 results for the 54% biased generator showing the combinations Longest Runs/Turning Point to Frequency Block/Serial. (Plot: percent matching, 0 to 100, versus sequence length from 25 to 100000 bits; series: L_T_FIPS, L_A_FIPS, L_FB_FIPS, L_S_FIPS, P_T_FIPS, P_A_FIPS, P_FB_FIPS, P_S_FIPS, T_A_FIPS, T_FB_FIPS, T_S_FIPS, A_FB_FIPS, A_S_FIPS, FB_S_FIPS.)

Conclusion

A biased sequence is one possible failure that can come directly from the RNG. A properly functioning cryptographic RNG has an equal probability of producing a one or a zero, P(X = 1) = 0.5. To ensure that the generator is functioning properly and is not being influenced, it needs to be tested for this particular failure before operation.

The results in the last section show that a bias as low as 52% can be detected using the FIPS test. For each of the biased generators the best tests are the Frequency and Serial tests. There is an improvement at lower test sequence lengths (1000 to 2500) when they are combined; however, this improvement does not push the tests to full FIPS matching. It is recommended that either the Serial or Frequency test be included in any online test unit for cryptographic RNGs.

Selecting the testing length is hard, since each of the tests has a different sensitivity level. A good compromise is the 10000 bit sequence length. The selected tests can catch both the 54% and 56% bias with a 100% match to the FIPS standard, while at 52% bias there is still an 82% success rate. This test sequence length is significantly lower than the 20000 bits of the FIPS, but still provides good testing for the given bias levels.

A final suggestion for the test selection and sequence length of the test unit is provided in the final conclusion, where the results from the hardware and software analysis are combined.

Figure 6.19: Frequency spectrum of single frequency and wide frequency group example. (Panels: a) Single frequency example; b) Wide frequency group example.)

From the document: Improving Security for Elliptic Curve Implementations on Smart Cards: A Random Number Generator Test Unit (pages 93-104)