I'm maintaining an R package named iRF, and a big problem is that it isn't reproducible. In other words, I cannot get the same result by setting set.seed
. For the purpose of this question, let's focus on the function RIT
. You don't need to figure out what it does; just look at the RNG handling part instead.
It is defined in R/RIT.R
, which calls either RIT_1class
or RIT_2class
depending on the input type. Both RIT_[1|2]class
functions are defined in src/ExportedFunctionsRIT.cpp
, which in turn calls helper functions defined in src/RITmain.h
and src/RITaux.h
.
I'm using Rcpp attributes, so randomness in RIT_[1|2]class
should be correctly handled by an implicit RNGScope
, as mentioned in this answer. However, this codebase is tricky to tackle in two ways,
- The functions
RIT_basic
andRIT_minhash
use// [[Rcpp::plugins(openmp)]]
. Fortunately, the original author gives each thread a separate seed, so hopefully, I can make it deterministic withseeds[i] = rand() * (i+1)
, yet you can tell this along isn't enough since I'm asking here.
// Set up vector of seeds for RNG
vector<unsigned int> seeds(n_cores);
for (int i=0; i<n_cores; i++) {
seeds[i] = chrono::high_resolution_clock::now().time_since_epoch().count()*(i+1);
}
- One of the functions,
CreateHT
usesrandom_device rd;
. I'm not familiar with C++ but a quick search reveals it generates "non-deterministic random numbers".
void CreateHt(...) {
// Ht is p by L
random_device rd; //seed for Random Number Generator(RNG)
mt19937_64 mt(rd()); //Use Mersenne Twister as RNG
...
shuffle(perm.begin(), perm.end(), mt);
...
}
From my understanding, both rand()
and random_device
are C++'s builtin random artifacts. How can I make them respect .Random.seed
?
Aucun commentaire:
Enregistrer un commentaire