I have a large dictionary. The keys are objects and the values are how often the given object appears in my data.
I would like to randomly choose an object from the dictionary but have the choice be weighted towards objects with higher corresponding values.
So far, I have been able to achieve this by adding each object to a list x times, where x is its corresponding value in the dictionary, and then calling random.choice() on this list. Like so:
import random

myDict = {'foo': 10,
          'boo': 5,
          'moo': 3,
          'roo': 2,
          'goo': 1,
          'oo': 0}

# Add each object to the list as many times as its value.
selection = []
for obj in myDict.keys():
    for n in range(myDict[obj]):
        selection.append(obj)
To make sure that this is working, I've run random.choice() on the list 10000 times and saved the results. Here are 4 of the results I've gotten:
{'foo': 4841, 'boo': 2397, 'moo': 1391, 'roo': 907, 'goo': 464, 'oo': 0}
{'foo': 4771, 'boo': 2410, 'moo': 1435, 'roo': 917, 'goo': 467, 'oo': 0}
{'foo': 4815, 'boo': 2340, 'moo': 1431, 'roo': 953, 'goo': 461, 'oo': 0}
{'foo': 4718, 'boo': 2443, 'moo': 1404, 'roo': 947, 'goo': 488, 'oo': 0}
As you can see, the distribution fits the frequency described in the dictionary.
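For reference, the check can be sketched roughly as below. This is a minimal sketch, not necessarily the exact code used for the tallies above: it assumes the counts are kept with collections.Counter, and the helper name tally_choices is made up for illustration.

```python
import random
from collections import Counter

myDict = {'foo': 10, 'boo': 5, 'moo': 3, 'roo': 2, 'goo': 1, 'oo': 0}

# Expand each key into as many list entries as its value (the approach described above).
selection = [obj for obj, count in myDict.items() for _ in range(count)]

def tally_choices(pool, trials=10000):
    """Hypothetical helper: draw `trials` items uniformly from `pool` and count each one."""
    return Counter(random.choice(pool) for _ in range(trials))

results = tally_choices(selection)

# Keys with value 0 never enter the list, so look every key up explicitly;
# a Counter returns 0 for keys it has not seen.
print({obj: results[obj] for obj in myDict})
```

Each run prints a different tally, but the proportions should stay close to 10 : 5 : 3 : 2 : 1 : 0.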
My problem is that in my production code I have thousands of dictionaries each containing thousands of objects. The dictionaries are of variable length. My current method is very inefficient and slow. Is there a better way? I don't mind using a different structure to store the data as it comes in.
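One direction that might avoid the expanded list, sketched under the assumption that Python 3.6 or later is available: random.choices() in the standard library accepts weights (or cumulative weights) directly. The helper names build_sampler and weighted_pick are hypothetical, and I have not measured this against my production data.

```python
import random
from itertools import accumulate

myDict = {'foo': 10, 'boo': 5, 'moo': 3, 'roo': 2, 'goo': 1, 'oo': 0}

def build_sampler(counts):
    """Hypothetical helper: precompute keys and cumulative weights once per dictionary."""
    keys = list(counts)
    cum_weights = list(accumulate(counts[k] for k in keys))
    return keys, cum_weights

def weighted_pick(keys, cum_weights):
    """Hypothetical helper: draw one key with probability proportional to its weight."""
    # random.choices returns a list of k items, so take the single element.
    return random.choices(keys, cum_weights=cum_weights, k=1)[0]

keys, cum_weights = build_sampler(myDict)
pick = weighted_pick(keys, cum_weights)
print(pick)
```

This stores one entry per key rather than one entry per occurrence, so memory grows with the number of objects instead of the sum of the values. The cumulative weights would need to be rebuilt whenever a dictionary changes.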