Saturday, 27 October 2018

Is R's random number generator faulty?

I was looking into the RNG of base R and wondered whether its 32-bit implementation of the Mersenne-Twister becomes a limitation when a very large number of random numbers is drawn. So I ran a simple test:

set.seed(8)
length(unique(runif(1e8)))
# [1] 98845641
1e8 - 98845641
# [1] 1154359

So there are indeed many duplicates among the 100 million draws: 1,154,359 of them, or a little over 1%.
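As a rough sanity check on that count, here is a minimal sketch of the birthday-problem estimate. It assumes that `runif()` can return at most 2^32 distinct, equally likely values; that figure is an assumption taken from the note in R's `?Random` help page about 32-bit generator output, not something the test above establishes. The variable names are illustrative.

```r
n <- 1e8   # number of draws, as in the test above
m <- 2^32  # ASSUMED number of distinct values runif() can return

# Expected number of distinct values when n draws fall uniformly on m values:
#   m * (1 - (1 - 1/m)^n)
# log1p() and expm1() keep the arithmetic accurate for tiny 1/m.
expected_unique <- -m * expm1(n * log1p(-1 / m))

# Expected number of draws that repeat an earlier value
expected_dups <- n - expected_unique
print(expected_dups)
```

Under that assumption the estimate comes out at roughly 1.155 million, the same order as the 1,154,359 duplicates observed, which would suggest the duplicates reflect the limited output resolution rather than a defect in the generator.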

When I switch to the 64-bit version of the Mersenne-Twister implemented in the dqrng package, the duplicates do not appear.
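The comparison described above might look something like the following sketch. It is not the exact code that was run: `count_duplicates` is a hypothetical helper, the sample size is reduced from 1e8 to 1e6 to keep the run short, and the dqrng part only executes if the package is installed.

```r
# Hypothetical helper: number of repeated values among n uniform draws
# produced by the generator function `draw`.
count_duplicates <- function(draw, n) {
  x <- draw(n)
  n - length(unique(x))
}

n <- 1e6  # smaller than the 1e8 used above, to keep the run short

# Base R (Mersenne-Twister is the default generator)
set.seed(8)
dup_base <- count_duplicates(runif, n)
cat("base R duplicates:", dup_base, "\n")

# dqrng's 64-bit Mersenne-Twister, only if the package is available
if (requireNamespace("dqrng", quietly = TRUE)) {
  dqrng::dqRNGkind("Mersenne-Twister")
  dqrng::dqset.seed(8)
  dup_dqrng <- count_duplicates(dqrng::dqrunif, n)
  cat("dqrng duplicates:", dup_dqrng, "\n")
}
```

With fewer draws the absolute number of duplicates is much smaller, so the 1e8 case is the one that makes the difference between the two generators obvious.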

Question 1:

Does the "64-bit" here refer to the type of floating-point number used?

Question 2:

Am I right to conclude that duplicates are less likely with the 64-bit MT because of its much larger range of possible values (64-bit FP vs. 32-bit FP)?



