Wednesday, June 6, 2018

Why does tempdir() give the same result on every core of a fork cluster?

I want to run several bootstrap samples in parallel. The calculation involves creating a temporary directory for each sample. I use the future package with plan(multisession), which automatically creates a fork cluster on my Linux machine to run the samples in parallel.

My problem is that tempdir() does not return different results for each sample, not even when I set.seed(.) differently for each core.

MCVE (this will not work on Windows, because Windows cannot fork()):

clu <- parallel::makeForkCluster(4)
parallel::clusterApply(clu, 1:4, 
  function(x){ set.seed(x); tempdir() })
## [1] "/tmp/Rtmp0uaUin" "/tmp/Rtmp0uaUin" "/tmp/Rtmp0uaUin" "/tmp/Rtmp0uaUin"

If I restart R, I get a different directory, but within a session the return values are the same on every worker.

On the other hand, other random functions work fine, at least if I include set.seed(x):

unlist(parallel::clusterApply(clu, x = 1:4, 
  function(x){ set.seed(x); rnorm(1) }))
## [1] -0.6264538 -0.8969145 -0.9619334  0.2167549

unlist(parallel::clusterApply(clu, x = 1:4, 
  function(x){ rnorm(1) }))
## [1] -1.100044 -1.100044 -1.100044 -1.100044

Why does tempdir() behave differently than other random functions, and what can I do about it?
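A possible workaround, offered as a sketch rather than a definitive answer: tempdir() does not draw from the random number generator. It returns the per-session temporary directory, which is fixed when the R session starts and is inherited by forked workers, so set.seed() has no effect on it. Assuming all that is needed is a distinct scratch directory per sample, tempfile() can generate a fresh path under that shared directory. The helper name make_sample_dir and the "sample_" prefix below are illustrative, not part of any package.

```r
# Sketch: one scratch directory per bootstrap sample.
# make_sample_dir is a hypothetical helper name used only for illustration.
make_sample_dir <- function(x) {
  # tempfile() only proposes an unused path under tempdir(); embedding the
  # sample index in the pattern keeps the paths distinct across workers.
  d <- tempfile(pattern = paste0("sample_", x, "_"))
  dir.create(d)
  d
}

# Like the MCVE above, this needs a platform that supports fork().
clu <- parallel::makeForkCluster(4)
dirs <- unlist(parallel::clusterApply(clu, 1:4, make_sample_dir))
parallel::stopCluster(clu)

# Each sample now has its own directory below the shared session tempdir().
print(basename(dirs))
```

Because the directories live below the session's tempdir(), they should be cleaned up together with it when the main R session ends; unlink(dirs, recursive = TRUE) removes them earlier if needed.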



