jeudi 16 juin 2022

Decimal precission problems with runif

I'm running into issues when simulating low-probability events with runif in R, and wondering how to solve this.

Consider the following example for an experiment where we simulate values of TRUE with probability 5e-10 in a sample of size 10e9, and check if any of these samples got that value of TRUE. This experiment is repeated 10 times:

set.seed(123)
probability <- 0.0000000005
n_samples <- 1000000000
n_tries <- 10
for (i in 1:n_tries) {
  print(any(runif(n=n_samples, min=0, max=1) < probability))
}

Code above will run relatively fast, and nearly half of the experiment replicates will return TRUE as expected.

However, as soon as the probability becomes 5e-11 (probability <- 0.00000000005), that expectation fails and no TRUE values will be returned even if the number of replicates is increased (used n_tries <- 100 twice with no luck; the whole process took 1h running).

This means runif is not returning values with as many precision as 11 decimals. This was unexpected, as R to my understanding works with as much as 16 decimals of precision, and we might need to simulate processes with probabilities that small (around 15 decimals).

Is this why runif fails to provide the expected output? are there any other alternatives/solutions to this problem?

Thank you




Aucun commentaire:

Enregistrer un commentaire