Tuesday, March 31, 2015

How do I interpret the results from dieharder for great justice

This is a question about an SO question; I don't think it belongs on meta despite being so by definition, but if someone feels it should go to math, cross-validated, etc., please let me know.


Background: @ForceBru asked this question about how to generate a 64-bit random number using rand(). @nwellnhof provided an answer, which was accepted, that basically takes the low 15 bits of 5 random numbers (because RAND_MAX is apparently only guaranteed to cover 15 bits on at least some compilers), glues them together, and then drops the first 11 bits (15*5-64=11). @NikBougalis commented that while this seems reasonable, it won't pass many statistical tests of randomness. @Foon (me) asked for a citation or an example of a test that it would fail. @NikBougalis replied with an answer that didn't clear it up for me; @DavidSwartz suggested running it against dieharder.


So, I ran dieharder. I ran it against the algorithm in question:



    unsigned long long llrand() {
        unsigned long long r = 0;

        for (int i = 0; i < 5; ++i) {
            r = (r << 15) | (rand() & 0x7FFF);
        }

        return r & 0xFFFFFFFFFFFFFFFFULL;
    }
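To make the bit accounting concrete (5 * 15 = 75 bits squeezed into 64), here is a minimal sketch of the same gluing loop with the five chunks passed in instead of drawn from rand(). The name glue15 and the use of uint64_t are my own for illustration, not part of the accepted answer. It shows that only the low 4 bits of the first chunk survive; its upper 11 bits are shifted out.

```c
#include <stdint.h>

/* Hypothetical helper: glue five 15-bit chunks together the same way
   llrand() does, but take the chunks as arguments so the result can be
   checked by hand. */
uint64_t glue15(const unsigned chunks[5]) {
    uint64_t r = 0;
    for (int i = 0; i < 5; ++i) {
        /* Unsigned shifts wrap: bits pushed past bit 63 are discarded. */
        r = (r << 15) | (chunks[i] & 0x7FFF);
    }
    return r;
}
```

The first chunk ends up shifted left by 60, so a first chunk of 0x7FF0 (upper 11 bits set, low 4 bits clear) contributes nothing to the result, while 0x000F lands in the top nibble.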


For comparison, I also ran it against just rand() and against just 8 bits of rand() at a time:



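    #include <stdio.h>
    #include <stdlib.h>

    /* Write each full int returned by rand() to stdout as raw bytes. */
    void rand_test()
    {
        int x;
        srand(1);
        while(1)
        {
            x = rand();
            fwrite(&x,sizeof(x),1,stdout);
        }
    }

    /* Write only the low 8 bits of each rand() result to stdout. */
    void rand_byte_test()
    {
        int x;
        unsigned char c;
        srand(1);
        while(1)
        {
            x = rand();
            c = x % 256;
            fwrite(&c,sizeof(c),1,stdout);
        }
    }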


The algorithm in question came back with two tests showing weaknesses: rgb_lagged_sum for ntuple=28 and one of the sts_serial tests for ntuple=8.


The run using just rand() failed horribly on many tests, presumably because I'm taking a number that has 15 bits of randomness and passing it off as 32 bits of randomness.
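That "presumably" can be checked directly. The sketch below (bits_ever_set is a name I made up for illustration) ORs together many rand() results, so any bit still clear at the end was never set in any sample. How many bits stay clear depends on the platform's RAND_MAX: with RAND_MAX equal to 32767 the upper 17 bits of every 32-bit int written would be constant zeros, and even with a larger RAND_MAX the sign bit is never set, because rand() only returns values in the range 0 to RAND_MAX.

```c
#include <stdlib.h>

/* Hypothetical helper: OR together n outputs of rand(). A bit is set in
   the result only if it was set in at least one of the samples. */
unsigned int bits_ever_set(unsigned int seed, int n) {
    unsigned int seen = 0;
    srand(seed);
    for (int i = 0; i < n; ++i) {
        seen |= (unsigned int)rand();
    }
    return seen;
}
```

Any bit position that never shows up in the result is a constant zero in the raw stream fed to dieharder, which on its own would be enough to fail tests that expect every bit to be random.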


The run using the low 8 bits of rand() at a time came back as weak for rgb_lagged_sum with ntuple=2.


My question(s) is:



  1. Am I interpreting the results for 8 bits of rand() correctly, namely that given that one of the tests came back as weak (a test which was marked as "good"; for the record, it also came back as weak for one of the dieharder tests marked "suspect"), rand()'s randomness should be marginally suspected?

  2. Am I interpreting the results for the algorithm under test correctly, namely that it should also be marginally suspected (perhaps twice as much as just rand())?

  3. Given the description of what the tests that came back as weak do (e.g. sts_serial looks at whether the distribution of bit patterns of a certain size is valid), should I be able to determine what the bias likely is?

  4. If the answer to 3 is yes, since I can't, can someone point out what I should be seeing?
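To make question 3 concrete, here is the kind of pattern-frequency check I have in mind, reduced to a minimal sketch. This is not dieharder's sts_serial computation; it only counts non-overlapping 8-bit values in a buffer and computes a Pearson chi-square statistic against a uniform expectation. The name byte_chi_square is my own.

```c
#include <stddef.h>

/* Hypothetical helper: Pearson chi-square statistic for byte frequencies
   in buf, compared with a uniform expectation of n/256 per byte value.
   Returns 0 for an empty buffer. */
double byte_chi_square(const unsigned char *buf, size_t n) {
    size_t counts[256] = {0};
    if (n == 0) {
        return 0.0;
    }
    for (size_t i = 0; i < n; ++i) {
        counts[buf[i]]++;
    }
    double expected = (double)n / 256.0;
    double chi2 = 0.0;
    for (int v = 0; v < 256; ++v) {
        double d = (double)counts[v] - expected;
        chi2 += d * d / expected;
    }
    return chi2;
}
```

A perfectly flat histogram gives 0, and a buffer of identical bytes gives a huge value. Looking at which counts deviate most is how I would expect to spot the direction of a bias, if there is one to see.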




