samedi 7 novembre 2015

Distribution of Number of Digits of Random Numbers

I encounter this curious phenomenon trying to implement a UUID generator in JavaScript.

Basically, in JavaScript, if I generate a large list of random numbers with the built-in Math.random() on Node 4.2.2:

var records = {};
var l;
for (var i=0; i < 1e6; i += 1) {
  l = String(Math.random()).length;
  if (records[l]) {
    records[l] += 1;
  } else {
    records[l] = 1;
  }
}
console.log(records);

The numbers of digits have a strange pattern:

{ '12': 1,
  '13': 11,
  '14': 65,
  '15': 663,
  '16': 6619,
  '17': 66378,
  '18': 611441,
  '19': 281175,
  '20': 30379,
  '21': 2939,
  '22': 282,
  '23': 44,
  '24': 3 }

I thought this is a quirk of the random number generator of V8, but similar pattern appears in Python 3.4.3:

12 : 2
13 : 5
14 : 64
15 : 672
16 : 6736
17 : 66861
18 : 610907
19 : 280945
20 : 30455
21 : 3129
22 : 224

And the Python code is as follows:

import random
random.seed()
records = {}
for i in range(0, 1000000):
    n = random.random()
    l = len(str(n))
    try:
        records[l] += 1
    except KeyError:
        records[l] = 1;

for i in sorted(records):
    print(i, ':', records[i])

The pattern from 18 to below is expected: say if random number should have 20 digits, then if the last digit of a number is 0, it effectively has only 19 digits. If the random number generator is good, the probability of that happening is roughly 1/10.

But why the pattern is reversed for 19 and beyond?

I guess this is related to float numbers' binary representation, but I can't figure out exactly why.




Aucun commentaire:

Enregistrer un commentaire