samedi 15 août 2020

How to Generate Sorted Uniformly Distributed Random Numbers Efficiently in C++?

I'd like to generate a large number n (i.e., n >= 1,000,000,000) of sorted and uniformly distributed random numbers in C++.

A first and simple approach I considered was to

  1. sequentially generate n uniformly distributed numbers using an std::uniform_real_distribution<double>,
  2. and then sort them using std::sort.

However, this takes several minutes.

A second and more sophisticated approach was to do parallelize the two steps as in:

template <typename T>
void computeUniformDistribution(std::vector<T>& elements)
{
    #pragma omp parallel
    {
        std::seed_seq seed{distribution_seed, static_cast<size_t>(omp_get_thread_num())};
        std::mt19937 prng = std::mt19937(seed);
        std::uniform_real_distribution<double> uniform_dist(0, std::numeric_limits<T>::max());

        #pragma omp for
        for (size_t i = 0; i < elements.size(); ++i)
        {
            elements[i] = static_cast<T>(uniform_dist(prng));
        }
    }

    std::sort(std::execution::par_unseq, elements.begin(), elements.end());
}

However, even this takes about about 30 seconds. Given that the generation of the uniformly distributed numbers takes only about 1.5 seconds, the bottleneck is still the sort phase.

Hence, I'd like to ask the following question: How to efficiently generate uniformly distributed data in a sorted way?




Aucun commentaire:

Enregistrer un commentaire