vendredi 3 août 2018

C++ : Create unique random distributions 1000 times with no repeated numbers in any distribution

Problem:

I have a number range of 1 to 20,000. I would like to sample a uniform distribution of 8 different numbers from the range, 1000 times. Each distribution should have no repeated numbers. Also every distribution among the 1000 distributions must be unique (after doing alphabetical sort of all obtained distributions, no two distributions should be technically the same number sequence).

What I did:

I came across the std::uniform_int_distribution<int> in C++ 11 standard. So tried this following code to test its uniqueness over different ranges.

#include <iostream>
#include <random>

int main(int, char*[]) {
    const int range_from  = 0;
    const int range_to    = 20000;
    std::random_device                  rand_dev;
    std::mt19937                        generator(rand_dev());
    std::uniform_int_distribution<int>  distr(range_from, range_to);

    for(int i = 0; i < 1000; ++i){ //Iteration for maximum distributions

        for (int j = 0; j < 8; ++j) //Iteration for random distribution generation
            std::cout << distr(generator) << '\n';
    }
}

Output obtained for different attempts

Attempt 1:      Attempt 2:

16102           6656
540             10316
8718            12251
18059           1254
10976           3982
18391           12857
14748           9253
12137           3335

Some facts:

The output appears unique based on the two attempts. I also realize the fact that there can be only 20,000/8 = 2,500 such possible distributions which can be unique (no same numbers in each distribution and no repetition of distributions when we do an alphabetical sort).

My Question:

Can this code really do what I intend to do in its current form ? Or should I do some expensive verification process like doing an alphabetical sort of each distribution and record each each of them in an std::vector and check whether the distribution is already there ?

The latter process is very expensive and takes huge computational time. Would you suggest any alternatives/ better ways of doing it ?




Aucun commentaire:

Enregistrer un commentaire