Monday, April 29, 2019

Why is random.choice so imbalanced?

I have been writing some experimental code, and the results I got are puzzling.

This is my code:

from random import choice

class A:

    def __init__(self, l):
        parsed = iter(l)

        self.include = next(parsed) == "+"
        self.l = list(parsed)

    def get(self, l):
        rt = self.l
        if self.include:
            rt += l
        return choice(rt)

a = A("+abcd")

d = dict()
for key in "abcdef":
    d[key] = 0

for i in range(100000):
    d[a.get(["e", "f"])] += 1

print(d)

I expected that code to output a random but somewhat even distribution of choices. Something like this:

{'a': 16678, 'b': 16539, 'c': 16759, 'd': 16584, 'e': 16631, 'f': 16809}

But the actual output is this:

{'a': 3, 'b': 4, 'c': 7, 'd': 3, 'e': 49588, 'f': 50395}

I mean, it is random, but if that were for real I might as well have won the lottery 10 times by now.

So, what exactly is going on here? Why does the random.choice function prefer to choose "e" and "f" so much over the others?
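
A minimal sketch of what appears to be going on, assuming the goal is for get() to pick from the stored letters plus the extra ones without changing the stored list. In the code above, rt = self.l does not copy the list, so rt and self.l are the same object, and rt += l extends that object in place. Every call therefore appends another "e" and "f" to self.l, which holds 4 + 2N items after N calls, so choice() itself stays uniform but is drawing from a list that is almost entirely "e" and "f". The rewritten get() below is only one possible fix (it builds a new list with +), and the names base, alias, base_len_after and stored_len are made up for the demonstration.

```python
from random import choice


class A:
    def __init__(self, l):
        parsed = iter(l)
        self.include = next(parsed) == "+"
        self.l = list(parsed)

    def get(self, l):
        # self.l + l creates a new list, so self.l is never modified.
        rt = self.l + l if self.include else self.l
        return choice(rt)


# The aliasing effect on plain lists (hypothetical names, for illustration):
base = ["a", "b", "c", "d"]
alias = base                 # same list object, not a copy
alias += ["e", "f"]          # += on a list extends it in place
base_len_after = len(base)   # base has grown as well

# With the rewritten get(), the stored list keeps its original size.
a = A("+abcd")
for _ in range(1000):
    a.get(["e", "f"])
stored_len = len(a.l)
```

With this version, running the counting loop from the question should give six roughly equal counts, since every call chooses from the same six letters.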



