vendredi 31 juillet 2020

Problem to generate random unique numbers from fixed population length

I have a problem here which is: I need to generate random numbers given a fixed length, and every time I generate those numbers, I need to check if it was already seen or not.

Example: my fixed population size is 2.000.000. So, for example, in the first round of my algorithm, my sample size is 400.000. I need to generate 400.000 over 2.000.000. After generating those random numbers, I save them in a HashSet.

In a second rand of my algorithm, let's say I want to generate 20.000 random numbers, but I need to check with those 20.000 numbers was already seen or not by looking at the HashSet (which contains the 400.000 initial numbers from the 1 round).

This is what I got so far:

1 round: 
         population size: 2.000.000
         sample: 400.000
         List<Integer> result = new ArrayList<Integer>(sample);

         I save the numbers in a variable called perm[],so :
         int perm[] = new int[population]

  public List<Integer> generateRandomNumbers (int population, Set<Integer> setListStringSeen, int sample)
  {
   for (int i = 0; i < sample; i++)  
   {
      // random integer between i and population-i
      k = i + (int) (Math.random() * (population - i));
                    
     if(setListStringSeen.contains(k))
     {    
         // the problem here is: when I check here and if the newly generated number
         // was already see, I need to generate again a new number. But in this case,
         // the next number need to be checked again, because it could be seen too.
         // how can I end up this loop of checking?

        k = i + (int) (Math.random() * (population - i));
        
        if(setListStringSeen.contains(k))
        {
            System.out.println("we've choose this number once before");
        }
        
        setListStringSeen.add(k);           
        
      }
    
      int t = perm[k];
      perm[k] = perm[i];
      perm[i] = t;
    
   }

   for (int i = 0; i < sample; i++)
   {
      result.add(perm[i]);
   }

    at the end of 1 round, I add all the generated numbers in a HashSet:

   setListStringSeen.addAll(result);

   return result;

}

Now let's go to the 2 round: let's say we want to generate 20.000 new numbers: what I want is, check if those numbers the will be generated (in the second round) was already seen before by checking the Hashset variable. Any idea on how to do it?




Aucun commentaire:

Enregistrer un commentaire