mardi 4 septembre 2018

Selecting a random sample from a very large generator

I am trying to test some strategies for a game, which can be defined by 10 non-negative integers that add up to 100. There are 109 choose 9, or roughly 10^12 of these, so comparing them all is not practical. I would like to take a random sample of about 1,000,000 of these.

I have tried the methods from the answers to this question, and this one, but all still seem far too slow to work. The quickest method seems like it will take about 180 hours on my machine.

This is how I've tried to make the generator (adapted from a previous SE answer). For some reason, changing prob does not seem to impact the run time of turning it into a list.

def tuples_sum_sample(nbval,total, prob, order=True) :
    """ 
        Generate all the tuples L of nbval positive or nul integer 
        such that sum(L)=total.
        The tuples may be ordered (decreasing order) or not
    """
    if nbval == 0 and total == 0 : yield tuple() ; raise StopIteration
    if nbval == 1 : yield (total,) ; raise StopIteration
    if total==0 : yield (0,)*nbval ; raise StopIteration
    for start in range(total,0,-1) :
        for qu in tuples_sum(nbval-1,total-start) :
            if qu[0]<=start :
                sol=(start,)+qu
                if order :
                    if random.random() <prob:
                        yield sol
                else :
                    l=set()
                    for p in permutations(sol,len(sol)) :
                        if p not in l :
                            l.add(p)
                            if random.random()<prob:
                                yield p

Rejection sampling seems like it would take about 3 million years, so this is out as well.

randsample = []
while len(randsample)<1000000:
    x = (random.randint(0,100),random.randint(0,100),random.randint(0,100),random.randint(0,100),random.randint(0,100),random.randint(0,100),random.randint(0,100),random.randint(0,100),random.randint(0,100),random.randint(0,100))
    if sum(x) == 100:
        randsample.append(x)
randsample

Can anyone think of another way to do this?

Thanks




Aucun commentaire:

Enregistrer un commentaire