Friday, 8 September 2017

Fixing the seed for parallel simulation runs with different numbers of cores

I'd like to parallelize a simulation study to speed it up, and I'd also like the results to be reproducible. In particular, I'd like to obtain the same result as if I had called set.seed at the beginning of a sequential simulation run. Here is an example of how I try to set it up (I purposely use .inorder=TRUE here):

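library(doSNOW)    # foreach backend; also loads snow and foreach
library(rlecuyer)  # L'Ecuyer RNG streams used by clusterSetupRNGstream

nr.cores = 4
nr.simulations = 10
sample.size = 100000

seed = 12345

# Start the workers and register them as the foreach backend
cl = makeCluster(nr.cores)
registerDoSNOW(cl)
clusterExport(cl=cl, list=c('sample.size'), envir=environment())
# One RNG stream per worker, all derived from the same six-element seed
clusterSetupRNGstream(cl, rep(seed,6))

# Each task draws sample.size normals and returns the last one
result = foreach(i=1:nr.simulations, .combine = 'c', .inorder=TRUE)%dopar%{
  tmp = rnorm(sample.size)
  tmp[sample.size]
}

stopCluster(cl)

print(paste0('nr.cores = ',nr.cores,'; seed = ',seed,'; time =',Sys.time()))
print(result)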
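
For reference, the sequential baseline mentioned at the start can be sketched as follows, assuming R's default RNG settings and a single set.seed call before the loop. The name result.seq is introduced here only for illustration.

```r
nr.simulations <- 10
sample.size <- 100000
seed <- 12345

# Sequential baseline: seed once, then run all simulations in order.
# Simulation i consumes the i-th block of sample.size draws from one stream.
set.seed(seed)
result.seq <- sapply(seq_len(nr.simulations), function(i) {
  tmp <- rnorm(sample.size)
  tmp[sample.size]
})
print(result.seq)
```

Re-running this block with the same seed gives the same vector every time, which is the behaviour the parallel version is meant to match.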

After running the parallel example above several times, I have two questions:

  1. The number of cores affects the resulting sequence: for nr.cores=1 and nr.cores=4 only the first values coincide, and for nr.cores=4 and nr.cores=8 only the first four values coincide. Is there a way to make the result independent of nr.cores? Conceptually, I imagine I could create an RNG stream of size nr.simulations * sample.size, split it into nr.simulations pieces and distribute them to the nodes always in the same order. Even simpler, I could fix nr.simulations different seed values and again pass them to the nodes in a fixed order. This could be done with some kind of mapping that each node uses to look up the appropriate seed value in a table. Is there a way to do this?

  2. When I run the script several times, the resulting sequence is occasionally (not always) reordered, even though I do not change any of the parameters (I just source the file again and again). This looks like a bug to me, as it suggests that either .inorder or clusterSetupRNGstream fails. Or am I missing something? Output of two consecutive runs:

    [1] "nr.cores = 4; seed = 12345; time =2017-09-08 19:00:24"
    [1]  1.327091137 -1.800244293 -1.163391460  0.005980001  0.957521136  1.641354433 -1.219033091
    [8] -0.238129356 -0.225193384  1.457018576
    
    [1] "nr.cores = 4; seed = 12345; time =2017-09-08 19:00:28"
    [1]  1.327091137 -1.800244293 -1.163391460  0.005980001 -0.238129356  0.957521136  1.641354433
    [8] -1.219033091  0.870269174 -0.225193384
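
A minimal sketch of the second idea in question 1 (one fixed seed per simulation rather than per node), assuming base R's parallel package is an acceptable substitute for rlecuyer here. It derives nr.simulations L'Ecuyer-CMRG streams from a single seed with nextRNGStream and installs the i-th stream right before task i runs, so the value of task i should no longer depend on which worker executes it or in what order. The helper names make.task.seeds and run.task are hypothetical, and the tasks are run with lapply only to keep the sketch self-contained. Note that this does not reproduce the numbers of a sequential run seeded once with set.seed; it only aims at independence from nr.cores.

```r
library(parallel)  # ships with R; provides nextRNGStream()

nr.simulations <- 10
sample.size <- 100000
seed <- 12345

# Hypothetical helper: one L'Ecuyer-CMRG stream per simulation (not per worker).
# Side effect: switches the session's RNG kind to "L'Ecuyer-CMRG".
make.task.seeds <- function(seed, n) {
  RNGkind("L'Ecuyer-CMRG")
  set.seed(seed)
  seeds <- vector("list", n)
  s <- get(".Random.seed", envir = globalenv())
  for (i in seq_len(n)) {
    seeds[[i]] <- s
    s <- nextRNGStream(s)  # jump to the start of the next stream
  }
  seeds
}

# Hypothetical helper: install the task's stream, then run the simulation body.
run.task <- function(task.seed, n) {
  assign(".Random.seed", task.seed, envir = globalenv())
  tmp <- rnorm(n)
  tmp[n]
}

task.seeds <- make.task.seeds(seed, nr.simulations)

# Sequential stand-in for the parallel loop.
result <- unlist(lapply(task.seeds, run.task, n = sample.size))

# Run the same tasks in reverse order and store each value at its task index.
other.order <- rev(seq_len(nr.simulations))
result2 <- numeric(nr.simulations)
result2[other.order] <- unlist(lapply(task.seeds[other.order], run.task,
                                      n = sample.size))

# Compare the per-task values of the two execution orders.
print(identical(result, result2))
```

In the foreach loop from the question, one would presumably iterate over the seed list instead of 1:nr.simulations, e.g. foreach(s = task.seeds, .combine = 'c') %dopar% run.task(s, sample.size); that variant is an assumption and has not been run here.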
    
    


