Wednesday, July 28, 2021

What is the difference between SeedSequence.spawn and SeedSequence.generate_state?

I am trying to use NumPy's SeedSequence to seed RNGs in different processes. However, I am not sure whether I should use ss.generate_state or ss.spawn:

import concurrent.futures

import numpy as np


def worker(seed):
    # Build a generator from the given seed (an integer or a SeedSequence).
    rng = np.random.default_rng(seed)
    return rng.random(1)


num_repeats = 1000
entropy = 243799254704924441050048792905230269161

# The guard keeps worker processes from re-running the pool setup on import
# (needed when the process start method is "spawn", e.g. on Windows and macOS).
if __name__ == "__main__":
    ss = np.random.SeedSequence(entropy)
    with concurrent.futures.ProcessPoolExecutor() as pool:
        result1 = np.hstack(list(pool.map(worker, ss.generate_state(num_repeats))))

    ss = np.random.SeedSequence(entropy)
    with concurrent.futures.ProcessPoolExecutor() as pool:
        result2 = np.hstack(list(pool.map(worker, ss.spawn(num_repeats))))

What are the differences between the two approaches and which should I use?

Using ss.generate_state is ~10% faster for the basic example above, likely because we are serializing 32-bit integers instead of SeedSequence objects.
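To make the comparison concrete, here is a minimal sketch of what each call hands to the workers. It is only an inspection of the return values, not a recommendation; the variable names (words, children, child_keys and so on) are illustrative and not part of the original example.

```python
import numpy as np

ss = np.random.SeedSequence(243799254704924441050048792905230269161)

# generate_state returns a NumPy array of integer words (uint32 by default).
# Passing one word to default_rng seeds a generator from that single integer.
words = ss.generate_state(4)

# spawn returns child SeedSequence objects. Each child keeps the parent's
# entropy and is distinguished by its spawn_key.
children = ss.spawn(4)

state_dtype = words.dtype
child_keys = [child.spawn_key for child in children]
same_entropy = all(child.entropy == ss.entropy for child in children)

print(state_dtype)
print(child_keys)
print(same_entropy)
```

So in the first approach each worker receives a single 32-bit integer, while in the second it receives a SeedSequence that carries the full parent entropy plus a distinct spawn key.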



