I would like to know if there are any best practices for setting up the random environment. Currently I use this simple structure in my config:
import numpy as np
from numpy.random import Generator, PCG64

rng = Generator(PCG64(42))
np.random.seed(42)
- the rng generator: for all general purposes (draws following a certain distribution, permutation of indices, synthetic datapoints, etc.)
- the legacy np.random.seed: to set the random state of scipy for the rvs method of scipy.stats generators.
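To make the second point concrete, this is the behaviour I rely on (a minimal sketch; norm is just a stand-in for any scipy.stats distribution):

import numpy as np
from scipy import stats

np.random.seed(42)                   # legacy global seed
sample_a = stats.norm.rvs(size=3)    # rvs falls back to the global numpy RandomState

np.random.seed(42)                   # reset the global seed
sample_b = stats.norm.rvs(size=3)    # same draws as above

print(np.allclose(sample_a, sample_b))  # True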
I read somewhere in the sklearn docs (in a warning section here) that the sklearn.model_selection module uses the same global seed as scipy; that would be the global seed set with np.random.seed, wouldn't it?
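This is roughly how I am testing that understanding (a sketch; KFold is just one example of a model_selection object):

import numpy as np
from sklearn.model_selection import KFold

np.random.seed(42)
splits_a = list(KFold(n_splits=3, shuffle=True, random_state=None).split(np.arange(12)))

np.random.seed(42)
splits_b = list(KFold(n_splits=3, shuffle=True, random_state=None).split(np.arange(12)))

# With random_state=None the splitter seems to draw from the global numpy
# RandomState, so the splits only match because I reset the seed in between.
print(all(np.array_equal(a[1], b[1]) for a, b in zip(splits_a, splits_b)))  # True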
If you have a better understanding of how scipy and sklearn refer to the global seed, and of what a good default randomization setup would be, that would be very useful. I have already read the docs related to this, but the indications are contradictory: for consistency one should pass a seed to the random_state parameter each time (with the np.random.RandomState() class), but they also say that if None is passed, it falls back to the global numpy seed. However, with the latter option I can't see a clearly consistent behaviour, and the former is very verbose.
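To make the trade-off concrete, this is roughly what the two options look like in my scripts (a sketch; train_test_split and RandomForestClassifier are only placeholders for whatever estimators are actually used):

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

SEED = 42

X = np.random.RandomState(SEED).normal(size=(100, 5))
y = (X[:, 0] > 0).astype(int)

# Option 1 (verbose): pass the seed to every random_state parameter explicitly.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=SEED)
clf = RandomForestClassifier(n_estimators=10, random_state=SEED).fit(X_train, y_train)

# Option 2: call np.random.seed(SEED) once at the top and leave random_state=None
# everywhere; this is the variant where I don't see a consistent behaviour.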
Any ideas?