Tuesday, July 2, 2019

Using Python 3.7.3, randomly select / choose from weighted list of files in a given directory

Using Python 3.7.3, I need to randomly choose from a weighted list of files in a given directory. Weights are determined by how new the file is and whether the user has marked it as a favorite (the newer the file, the more often it is selected).

What is the most efficient way to set the weights? I want the distribution of randomly chosen elements to match the distribution of weights in the list. The favorite flag will be stored in a dictionary with the file names as keys and True/False as values.

Assume the number of items in the weights list must equal the number of elements in filesList, and that the list of weights must collectively add up to 1. Also, this is being run on a Raspberry Pi 3/4.
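One possible way to derive such weights is sketched here, under assumptions that are not part of the question: each file gets a raw score equal to its recency rank (oldest = 1, newest = n), a favorite has its score multiplied by a hypothetical favorite_boost factor, and the scores are divided by their total so they add up to 1. The name build_weights, the rank-based scoring, and the boost value are illustrative choices only; other schemes (for example, scoring by age in days) would work just as well.

```python
import math


def build_weights(mtimes, favorites, favorite_boost=2.0):
    """Return (names, weights) with weights summing to 1.

    mtimes: dict mapping file name -> modification time
    favorites: dict mapping file name -> True/False
    favorite_boost: hypothetical multiplier applied to favorites
    """
    # Sort oldest first so the newest file receives the highest rank.
    names = sorted(mtimes, key=mtimes.get)
    scores = []
    for rank, name in enumerate(names, start=1):
        score = float(rank)
        if favorites.get(name, False):
            score *= favorite_boost
        scores.append(score)
    # Normalize so the weights can be passed as p= to numpy.random.choice.
    total = sum(scores)
    weights = [score / total for score in scores]
    return names, weights


# Made-up modification times, for illustration only.
example_mtimes = {'a': 100.0, 'b': 200.0, 'c': 300.0}
example_favorites = {'a': True, 'b': False, 'c': False}
names, weights = build_weights(example_mtimes, example_favorites)
assert math.isclose(sum(weights), 1.0)
```

For a real directory, the mtimes dictionary could be filled with os.path.getmtime(os.path.join('./assets', name)) for each name returned by os.listdir('./assets').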

If another method is better than numpy.random.choice, I'm all for it.
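For comparison, the standard library's random.choices (available since Python 3.6) accepts relative weights directly and does not require them to add up to 1. The following is a minimal sketch with placeholder file names and made-up weights; whether it is faster than numpy.random.choice on a Raspberry Pi is not measured here.

```python
import collections
import random

files = ['a', 'b', 'c']          # placeholder file names
relative_weights = [1, 2, 7]     # relative weights, need not sum to 1

rng = random.Random(0)           # seeded only to make the run repeatable
picks = rng.choices(files, weights=relative_weights, k=10000)
counts = collections.Counter(picks)

# Observed frequencies should be close to weight / sum(weights).
total_weight = sum(relative_weights)
for name, weight in zip(files, relative_weights):
    print(name, counts[name] / len(picks), weight / total_weight)
```

Drawing all samples in one call with k= avoids the per-call overhead of a Python-level loop.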

I've looked into "Randomly selecting an element from a weighted list."

import collections
import random    # needed for random.choice below
import numpy

#filesList = os.listdir('./assets')    

filesList= ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o'] 

count = len(filesList)

Favorites = {}
for key in filesList:    # was 'Listy', which is never defined
    Favorites[key] = random.choice([True, False])

weights = [
    0.01666666666666667,
    0.02666666666666667,
    0.03666666666666667,
    0.04666666666666667,
    0.05666666666666667,
    0.06666666666666667,
    0.06666666666666667,
    0.06666666666666667,
    0.06666666666666667,
    0.06666666666666667,
    0.07666666666666667,
    0.08666666666666667,
    0.09666666666666667,
    0.10666666666666667,
    0.11666666666666667,
]

# Currently the code for setting the weights is commented out, as I'm not sure how to do it. Above is an example of distribution. 

#weights = [0 for x in range(count)]
#for x in range(count):
#    #offset = ?
#    weights[x-1] = 1/count #+ offset


print(f'Favorites: {Favorites}')
print('weights', weights)    

total = sum(weights)     # sum of all weight values must be 1 (avoids shadowing the built-in sum)

print(f'sum of weights: {total}')

l = [numpy.random.choice(filesList, p=weights) for _ in range(10000)]

print(f'Results: {collections.Counter(l)}')



