Friday, May 5, 2023

How can I randomly select values from a very large number of permutations without causing memory issues?

I am solving a machine scheduling problem with a very large data set. For each machine I need a random subset of the permutations of its jobs, where the size of the subset is fixed in advance (num_selected: how many permutations I want to keep).

For example, if machine 1 has 15 jobs, there are 15! (about 1.3 × 10^12) possible permutations. Storing all of these sequences in a list and then sampling from it is not feasible, because it runs out of memory:

perms = list(permutations(self.machines[m]))
selected_perms = random.sample(perms, num_selected)
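To put a number on the memory problem, here is a rough back-of-the-envelope estimate. It is only a sketch: it counts the tuple objects that `permutations` would produce for 15 jobs and ignores the list that holds them and the job objects themselves, so the real figure would be higher.

```python
import math
import sys

n_jobs = 15
n_perms = math.factorial(n_jobs)  # 15! = 1,307,674,368,000

# Size of one 15-element tuple object only (the job objects are not counted).
bytes_per_perm = sys.getsizeof(tuple(range(n_jobs)))

approx_tib = n_perms * bytes_per_perm / 2**40
print(f"{n_perms:,} permutations, roughly {approx_tib:,.0f} TiB for the tuples alone")
```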

So I tried two approaches. (1) Repeatedly draw one random permutation and append it to a list if it is not already there (this is done for every machine). This guarantees randomness, but it is very slow and I don't know why.

selected_perms = []

while len(selected_perms) < num_selected:
  job_perms = random.sample(list(self.machines[m]), 15)
  if job_perms not in selected_perms:
    selected_perms.append(job_perms)
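One likely reason for the slowness is the `not in selected_perms` test: on a list it compares the candidate against every stored permutation, so the total work grows roughly quadratically with num_selected. A possible variant, sketched below, keeps the permutations as tuples in a set, where the membership check takes constant time on average. The function name `sample_unique_permutations` and the `rng` parameter are made up for this illustration, and the sketch assumes the jobs are distinct and hashable.

```python
import math
import random


def sample_unique_permutations(jobs, num_selected, rng=random):
    """Return num_selected distinct random permutations of jobs as tuples."""
    jobs = list(jobs)
    # Assumes the jobs are distinct; with repeated jobs fewer unique
    # orderings exist and the loop below might never finish.
    if num_selected > math.factorial(len(jobs)):
        raise ValueError("num_selected exceeds the number of permutations")

    seen = set()  # hash-based membership test, O(1) on average
    while len(seen) < num_selected:
        candidate = jobs[:]  # copy so the caller's sequence is not modified
        rng.shuffle(candidate)
        seen.add(tuple(candidate))  # a duplicate leaves the set unchanged
    return list(seen)


selected_perms = sample_unique_permutations(range(15), 1000)
```

As long as num_selected is tiny compared with 15!, duplicates are rare and the loop runs about num_selected times. It would slow down again if num_selected came close to the total number of permutations.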

(2) Shuffle the machine's job sequence first, then take the first num_selected permutations of the shuffled sequence. However, this does not work well because the randomness is poor.

perms = random.sample(list(self.machines[m]), job_count)
selected_perms = list(islice(permutations(perms), num_selected))
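The poor randomness of method 2 can be made visible with a small experiment. `itertools.permutations` emits results in lexicographic order of the input positions, so consecutive permutations differ only near the end of the sequence. The sketch below uses plain integers as stand-ins for the jobs and a hypothetical helper `common_prefix_length` to count how many leading positions are identical across the first 1000 permutations.

```python
from itertools import islice, permutations


def common_prefix_length(sequences):
    """Number of leading positions on which all sequences agree."""
    length = 0
    for column in zip(*sequences):
        if len(set(column)) > 1:
            break
        length += 1
    return length


jobs = list(range(15))  # stand-in for one shuffled machine sequence
first_1000 = list(islice(permutations(jobs), 1000))

# 1000 is below 7! = 5040, so only the last 7 positions ever change.
print(common_prefix_length(first_1000))
```

With 15 jobs, all 1000 selected sequences start with the same 8 jobs, so the sample covers only a very small corner of the permutation space, whatever the initial shuffle was.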

(3) I have an idea for improving method 2: shuffle the machine sequence, take a smaller number of permutations, and repeat this process until the total number of selected permutations is reached. However, I suspect this would only be slightly better than method 2.
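A minimal sketch of how method 3 might look, to make the idea concrete. The function name `sample_in_batches` and the `batch_size` and `rng` parameters are assumptions made for this illustration, and the jobs are assumed to be distinct and hashable. A set is used to drop any permutation that shows up twice.

```python
import math
import random
from itertools import islice, permutations


def sample_in_batches(jobs, num_selected, batch_size, rng=random):
    """Reshuffle, take a few consecutive permutations, repeat until done."""
    jobs = list(jobs)
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Assumes distinct jobs, so that n! unique orderings exist.
    if num_selected > math.factorial(len(jobs)):
        raise ValueError("num_selected exceeds the number of permutations")

    seen = set()
    while len(seen) < num_selected:
        rng.shuffle(jobs)  # new random starting order for each batch
        for perm in islice(permutations(jobs), batch_size):
            seen.add(perm)
            if len(seen) == num_selected:
                break
    return list(seen)


selected_perms = sample_in_batches(range(15), 1000, batch_size=10)
```

Permutations within one batch still share a long common prefix, so the sample is only as random as the number of batches allows. With batch_size=1 this reduces to method 1.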

I would appreciate any suggestions you may have!



