I am trying to use numpy's SeedSequence to seed RNGs in different processes. However, I am not sure whether I should use ss.generate_state or ss.spawn:
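import concurrent.futures

import numpy as np


def worker(seed):
    # seed is either a single uint32 word or a child SeedSequence
    rng = np.random.default_rng(seed)
    return rng.random(1)


if __name__ == "__main__":  # required when worker processes are started with "spawn"
    num_repeats = 1000

    # Approach 1: seed each worker with one 32-bit word from generate_state
    ss = np.random.SeedSequence(243799254704924441050048792905230269161)
    with concurrent.futures.ProcessPoolExecutor() as pool:
        result1 = np.hstack(list(pool.map(worker, ss.generate_state(num_repeats))))

    # Approach 2: seed each worker with a child SeedSequence from spawn
    ss = np.random.SeedSequence(243799254704924441050048792905230269161)
    with concurrent.futures.ProcessPoolExecutor() as pool:
        result2 = np.hstack(list(pool.map(worker, ss.spawn(num_repeats))))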
What are the differences between the two approaches and which should I use?
Using ss.generate_state is ~10% faster for the basic example above, likely because we are serializing 32-bit integers instead of SeedSequence objects.
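To make the comparison concrete, here is a small sketch that inspects what each method hands to the workers. It only uses the documented entropy and spawn_key attributes of SeedSequence; the variable names (words, children, word_bytes, child_bytes) are mine, and the pickle sizes are just a rough proxy for the serialization cost, not a benchmark.

```python
import pickle

import numpy as np

ss = np.random.SeedSequence(243799254704924441050048792905230269161)

# generate_state returns an array of 32-bit unsigned words by default,
# so each worker in the first approach is seeded from a single uint32.
words = ss.generate_state(4)
print(words.dtype)  # uint32

# spawn returns child SeedSequence objects that keep the parent's full
# entropy and differ only in their spawn_key.
children = ss.spawn(4)
print([child.spawn_key for child in children])  # [(0,), (1,), (2,), (3,)]
print(children[0].entropy == ss.entropy)  # True

# Rough proxy for what is sent to each worker process (sizes will vary
# with the numpy and pickle protocol versions).
word_bytes = len(pickle.dumps(words[0]))
child_bytes = len(pickle.dumps(children[0]))
print(word_bytes, child_bytes)
```

So with generate_state each worker's RNG starts from 32 bits of seed material, while with spawn each worker gets the parent entropy plus a distinct spawn_key.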