I'd like to generate a large number n
(i.e., n >= 1,000,000,000
) of sorted and uniformly distributed random numbers in C++.
A first and simple approach I considered was to
- sequentially generate
n
uniformly distributed numbers using anstd::uniform_real_distribution<double>
, - and then sort them using
std::sort
.
However, this takes several minutes.
A second and more sophisticated approach was to do parallelize the two steps as in:
template <typename T>
void computeUniformDistribution(std::vector<T>& elements)
{
#pragma omp parallel
{
std::seed_seq seed{distribution_seed, static_cast<size_t>(omp_get_thread_num())};
std::mt19937 prng = std::mt19937(seed);
std::uniform_real_distribution<double> uniform_dist(0, std::numeric_limits<T>::max());
#pragma omp for
for (size_t i = 0; i < elements.size(); ++i)
{
elements[i] = static_cast<T>(uniform_dist(prng));
}
}
std::sort(std::execution::par_unseq, elements.begin(), elements.end());
}
However, even this takes about about 30
seconds. Given that the generation of the uniformly distributed numbers takes only about 1.5
seconds, the bottleneck is still the sort phase.
Hence, I'd like to ask the following question: How to efficiently generate uniformly distributed data in a sorted way?
Aucun commentaire:
Enregistrer un commentaire