Tuesday, 19 September 2017

Python's random.sample and stack overflow

I've been working on a large scientific project in Python, and I'm running into a stack overflow issue that involves random.sample() and multiprocessing. I have a Goo object that contains a large population of Foo objects, which seek to make friends. To do so, each Foo randomly picks p other Foo belonging to the same Goo using random.sample(). Once they are all done, the program stops.

The code looks like this:

foo.py:

class Foo(object):
    def __init__(self):
        self.friends = []

and goo.py:

from foo import Foo
import random

class Goo(object):
    def __init__(self, nfoo):
        self.foo_list = [Foo() for i in range(nfoo)]

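    def sim_goo(self):
        # Each Foo draws 5 distinct Foo objects from the whole population
        # (the draw may include the Foo itself) and stores them as friends.
        for f in self.foo_list:
            candidates = random.sample(self.foo_list, 5)
            f.friends = candidates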

and using Jupyter I run:

from main import do_sim
from multiprocessing import Pool
pool = Pool(processes=2)
para_list = [1000, 1000]
result = pool.map_async(do_sim, para_list).get()

which raises a MaybeEncodingError: Error sending result: '[<goo.Goo object at 0x0000003B15E60A90>]'. Reason: 'RecursionError('maximum recursion depth exceeded while pickling an object',)'

Since it works with para_list = [10, 10], my only guess is that random.sample() cannot cope when the list it samples from gets too large, and that this becomes a problem under multiprocessing. But 1000 Foos isn't much.
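The error message mentions pickling rather than sampling, so one way to narrow this down is to pickle a Goo directly, without any Pool. The sketch below is only a hedged check: it redefines Foo and Goo inline so it runs on its own, and can_pickle and build are hypothetical helpers that are not part of the original code. It relies on pickle following every object reference it meets, so each Foo drags in its friends, their friends, and so on. Whether the 1000 case fails may depend on the Python version and its recursion limit, so no output is claimed for it.

```python
import pickle
import random


class Foo(object):
    def __init__(self):
        self.friends = []


class Goo(object):
    def __init__(self, nfoo):
        self.foo_list = [Foo() for _ in range(nfoo)]

    def sim_goo(self):
        for f in self.foo_list:
            # Friends are stored as direct references to other Foo objects.
            f.friends = random.sample(self.foo_list, 5)


def can_pickle(obj):
    """Hypothetical helper: True if pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
        return True
    except RecursionError:
        return False


def build(nfoo):
    """Hypothetical helper: create a Goo and run the friend selection."""
    goo = Goo(nfoo)
    goo.sim_goo()
    return goo


# No multiprocessing involved: only the pickling step that Pool performs
# when it sends a result back to the parent process.
small_ok = can_pickle(build(10))
large_ok = can_pickle(build(1000))
print("10 Foos pickled:", small_ok)
print("1000 Foos pickled:", large_ok)
```

If the second line reports False, the recursion comes from pickling the chain of friend references, and random.sample() itself is probably not at fault.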

Does anybody know an alternative?
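One possible alternative, assuming the recursion does come from pickling nested Foo references, is to store friends as integer indices into foo_list instead of as Foo objects. This is a minimal sketch and not a definitive fix: the p parameter and the friends_of method are hypothetical additions, and the classes are redefined inline so the block runs on its own.

```python
import pickle
import random


class Foo(object):
    def __init__(self):
        # Indices into Goo.foo_list rather than references to Foo objects.
        self.friends = []


class Goo(object):
    def __init__(self, nfoo):
        self.foo_list = [Foo() for _ in range(nfoo)]

    def sim_goo(self, p=5):
        n = len(self.foo_list)
        for f in self.foo_list:
            # Same draw as before (may include f itself), but over indices.
            f.friends = random.sample(range(n), p)

    def friends_of(self, i):
        """Hypothetical accessor: resolve stored indices back to Foo objects."""
        return [self.foo_list[j] for j in self.foo_list[i].friends]


goo = Goo(1000)
goo.sim_goo()

# Each Foo now holds only a flat list of ints, so pickling stays shallow.
restored = pickle.loads(pickle.dumps(goo))
```

With this layout a Foo no longer pulls other Foo objects into its own pickled state, so the nesting depth stays constant regardless of population size. Raising the limit with sys.setrecursionlimit() is another option, though it only moves the threshold.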

Thanks for your time!

Best,



