While trying to make a legacy library reproducible (so that a given seed always produces the same result), I discovered a race condition that looks like this:
#ifndef _OPENMP
#error "OpenMP support is required for this MCVE"
#endif
#include <Rcpp.h>
#include <random>
#include <omp.h>
using namespace std;
using namespace Rcpp;
// [[Rcpp::plugins(openmp)]]
// [[Rcpp::export]]
NumericVector random_vec(int length, int n_cores) {
    /*
     * Returns a random vector of LENGTH.
     */
    NumericVector result(length);
    // Set up seed for each thread
    IntegerVector seeds(n_cores);
    seeds = INT_MAX * runif(n_cores);
    // !!!RACE CONDITION!!!
    // elem1 and elem2 are declared outside the parallel region,
    // so they are shared between all threads.
    int elem1, elem2;
    omp_set_num_threads(n_cores);
    #pragma omp parallel shared(result)
    {
        // Seed each thread with a deterministic seed
        mt19937_64 mt(seeds[omp_get_thread_num()]);
        // I'm aware that WRE discourages the use of <random>, but I'm
        // maintaining a legacy codebase and I don't want to change
        // the existing code unless there is some drop-in replacement.
        uniform_int_distribution<> r_unif(INT_MIN / 2, INT_MAX / 2);
        #pragma omp for schedule(static) nowait
        for (int i = 0; i < length; i++) {
            // Threads overwrite the values set by each other
            elem1 = r_unif(mt);
            elem2 = r_unif(mt);
            result[i] = elem1 + elem2;
        }
    }
    return result;
}
The bug has since been fixed, but the library has been used in various research projects, so I want to estimate its impact. A few simulations indicate that the distribution of the generated random numbers doesn't change significantly with or without the race condition, but is there a case where it actually makes a difference?
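For comparison, the race goes away once elem1 and elem2 are private to each thread, for example by declaring them inside the loop body. Below is a minimal standalone sketch of that idea, not the library's actual fix: it assumes std::vector in place of the Rcpp types and takes the per-thread seeds from the caller, and random_vec_fixed with its parameters is an illustrative name only.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <random>
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical standalone variant of random_vec (an assumption, not the
// library's code): std::vector replaces the Rcpp types, and the per-thread
// seeds are supplied by the caller instead of being drawn with runif().
std::vector<double> random_vec_fixed(int length, int n_threads,
                                     const std::vector<unsigned>& seeds) {
    assert(length >= 0 && n_threads >= 1);
    assert(seeds.size() >= static_cast<std::size_t>(n_threads));
    std::vector<double> result(length);

    #pragma omp parallel num_threads(n_threads)
    {
        // Falls back to a single "thread 0" when compiled without OpenMP.
        int tid = 0;
#ifdef _OPENMP
        tid = omp_get_thread_num();
#endif
        // One generator per thread, seeded deterministically.
        std::mt19937_64 mt(seeds[tid]);
        std::uniform_int_distribution<> r_unif(INT_MIN / 2, INT_MAX / 2);

        #pragma omp for schedule(static)
        for (int i = 0; i < length; i++) {
            // Declared inside the loop body, so every thread has its own copy
            // and no thread can overwrite another thread's draws.
            const int elem1 = r_unif(mt);
            const int elem2 = r_unif(mt);
            result[i] = elem1 + elem2;
        }
    }
    return result;
}
```

With the same seeds, the same number of threads and schedule(static), two calls such as random_vec_fixed(1000, 4, {1u, 2u, 3u, 4u}) should return identical vectors. The output still depends on the thread count, since each thread consumes its own generator's stream.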