Wednesday, 24 October 2018

How do I get random objects from a dictionary, weighted by value?

I have a large dictionary. The keys are objects and the values are how often the given object appears in my data.

I would like to randomly choose an object from the dictionary but have the choice be weighted towards objects with higher corresponding values.

So far, I have achieved this by appending each object to a list x times, where x is the object's value in the dictionary, and then calling random.choice() on that list. Like so:

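import random

# Keys are the objects; values are how often each one appears in the data.
myDict = {'foo': 10,
          'boo': 5,
          'moo': 3,
          'roo': 2,
          'goo': 1,
          'oo': 0}

# Add each object to the list as many times as its count, so that
# random.choice() picks it with a probability proportional to that count.
selection = []
for obj, count in myDict.items():
    selection.extend([obj] * count)

choice = random.choice(selection)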

To make sure that this is working, I've run random.choice() on the list 10000 times and counted how often each object was picked. Here are the counts from 4 such runs:

{'foo': 4841, 'boo': 2397, 'moo': 1391, 'roo': 907, 'goo': 464, 'oo': 0}
{'foo': 4771, 'boo': 2410, 'moo': 1435, 'roo': 917, 'goo': 467, 'oo': 0}
{'foo': 4815, 'boo': 2340, 'moo': 1431, 'roo': 953, 'goo': 461, 'oo': 0}
{'foo': 4718, 'boo': 2443, 'moo': 1404, 'roo': 947, 'goo': 488, 'oo': 0}

As you can see, the distribution fits the frequency described in the dictionary.
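
A minimal sketch of how such a check could be written, for reference. The helper name `tally` and its `trials` parameter are hypothetical and not part of the code above; the counts it prints will vary from run to run.

```python
import random
from collections import Counter

myDict = {'foo': 10, 'boo': 5, 'moo': 3, 'roo': 2, 'goo': 1, 'oo': 0}

# Same expansion as above: each object appears once per unit of its count.
selection = [obj for obj, count in myDict.items() for _ in range(count)]

def tally(selection, keys, trials=10000):
    """Hypothetical helper: draw `trials` times and count each key's picks."""
    counts = Counter(random.choice(selection) for _ in range(trials))
    # Report every key, so objects that were never drawn show up as 0.
    return {key: counts[key] for key in keys}

results = tally(selection, myDict)
print(results)
```

With weights summing to 21, 'foo' should come up in roughly 10/21 of the draws (about 48%), which is in line with the counts listed above.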

My problem is that in my production code I have thousands of dictionaries each containing thousands of objects. The dictionaries are of variable length. My current method is very inefficient and slow. Is there a better way? I don't mind using a different structure to store the data as it comes in.
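
One possible direction, sketched under the assumption that Python 3.6 or later is available: the standard library's random.choices() accepts weights directly, so the expanded list is not needed. The names `build_sampler` and `sample` below are hypothetical, and this is only an illustration, not a benchmarked solution.

```python
import random
from itertools import accumulate

myDict = {'foo': 10, 'boo': 5, 'moo': 3, 'roo': 2, 'goo': 1, 'oo': 0}

def build_sampler(freqs):
    """Hypothetical helper: precompute keys and running totals once per dictionary."""
    keys = list(freqs)
    cum_weights = list(accumulate(freqs.values()))  # [10, 15, 18, 20, 21, 21]

    def sample(k=1):
        # Passing cum_weights avoids recomputing the running totals on each call.
        return random.choices(keys, cum_weights=cum_weights, k=k)

    return sample

sample = build_sampler(myDict)
draws = sample(k=10000)  # list of 10000 keys, weighted by their counts
```

The memory used here grows with the number of keys rather than with the sum of the values, and the cumulative totals only need to be rebuilt when a dictionary's counts change.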



