Wednesday, September 30, 2020

Repeatable random numbers using the doRNG package

I am trying to generate a repeatable set of random numbers in R using the foreach and doRNG packages. I expect to get the same random numbers in both of the runs below. However, that is not what I see.

Here is a brief overview of what I am trying to do.

  1. Set the seed
  2. In a loop of 20 iterations, sleep for a random duration and then draw 10 random numbers
  3. Save the first list of random numbers
  4. Repeat steps 1 and 2 with a different range of random sleep durations
  5. Save the second list of random numbers
  6. Compare the first and second lists of random numbers.

I expected the first and second sets of random numbers to be the same. The order might differ, because each parallel process sleeps for a random duration, which could affect the order in which the numbers are returned. However, in the case below, only 122 of the 200 numbers match. If I change the maximum sleep durations from 8 and 10 seconds to 8 and 12 seconds, the number of common values drops even further. I am using sleep as a proxy for the computation and other processing that would happen inside the foreach loop.

Any help with what could be wrong here is greatly appreciated.

library(doParallel)
library(doRNG)
library(foreach)

# Create a cluster with 10 workers and register it as the parallel backend
cl <- makeCluster(10)
registerDoParallel(cl)

## COLLECT THE FIRST SET OF RANDOM NUMBERS 
set.seed(123)
# Candidate sleep durations (in seconds) for the first run
x <- 0:10

# In each of 20 iterations, draw 10 random numbers after a random sleep
rp1 <- foreach(i=1:20, .combine = 'c', .options.RNG=123) %dorng%{ 
  # Sleep for random duration
  Sys.sleep(sample(x))
  return(rnorm(10,0,1))
}

## RESET THE SEED AND COLLECT SECOND SET OF RANDOM NUMBERS
set.seed(123)
# Candidate sleep durations (in seconds) for the second run
x <- 0:8

# In each of 20 iterations, draw 10 random numbers after a random sleep
rp2 <- foreach(i=1:20, .combine = 'c', .options.RNG=123) %dorng%{ 
  # Sleep for random duration
  Sys.sleep(sample(x))
  return(rnorm(10,0,1))
}

stopCluster(cl)

# Count the values that appear in both sets of random numbers
common <- intersect(rp1,rp2)
print(length(common))
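
One thing that may be worth checking in the code above is whether the sample(x) call inside the loop body uses up random draws before rnorm() runs. If it does, changing x would change the numbers that follow. The sketch below is a rough base R check of that idea, with no cluster involved. It uses R's default generator rather than the per-iteration streams that doRNG sets up, so it is only an analogy for the parallel case. draw_after_sample is a hypothetical helper written for this check and is not part of foreach or doRNG.

```r
# Base R only; no cluster is needed for this check.
# draw_after_sample() is a hypothetical helper, not part of doRNG or foreach.
draw_after_sample <- function(x, seed = 123) {
  set.seed(seed)
  if (!is.null(x)) {
    sample(x)            # same call as inside Sys.sleep(); result is discarded
  }
  rnorm(10, 0, 1)        # the numbers we actually want to compare
}

no_sample <- draw_after_sample(NULL)   # rnorm() straight after seeding
with_10   <- draw_after_sample(0:10)   # sample() over the first run's durations
with_8    <- draw_after_sample(0:8)    # sample() over the second run's durations
repeat_10 <- draw_after_sample(0:10)   # same inputs as with_10

print(identical(with_10, repeat_10))       # same seed and same x
print(identical(no_sample, with_10))       # does calling sample() first matter?
print(length(intersect(with_10, with_8)))  # overlap when only x differs
```

If the second comparison prints FALSE, then sample() is advancing the random stream, and the two foreach loops above are not really running the same random code even though they share a seed. In that case, drawing the rnorm() values before the sleep call, or choosing the sleep durations outside the loop, might be enough to make the two runs agree. I have not verified this against the parallel setup.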


