I have a Python script for running 10-fold cross validation (sitting on top of sklearn). Inside the script I set the random seed to 42. Each time I run the script I get varying results from the cross validation. So when I run 10 x 10-fold CV, I get ten different accuracies.
I also have similar code in a Jupyter notebook, where I run the 10-fold cross validation in a loop. On every iteration I get the same results: this time 10 x 10-fold CV gives ten identical accuracies. The results only change if I restart the notebook's kernel. At that point I again get ten identical accuracies, but they differ from the ones I got before the restart.
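To make the two behaviours concrete, here is a minimal sketch using only the standard library. It is not sklearn's implementation and not my actual script: `kfold_indices` and `repeated_cv_splits` are hypothetical helpers, and the 20 samples are made up. It only illustrates that identical versus varying repetitions can come down to whether the generator is put back into the same state before each repetition.

```python
import random


def kfold_indices(n_samples, n_folds, rng):
    """Hypothetical helper: shuffle the sample indices with `rng`, then deal them into folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def repeated_cv_splits(n_repeats, seed, reseed_each_repeat):
    """Hypothetical helper: build the fold assignment for each repetition."""
    rng = random.Random(seed)
    splits = []
    for _ in range(n_repeats):
        if reseed_each_repeat:
            # Same starting state every repetition, so the same folds every time.
            rng = random.Random(seed)
        splits.append(kfold_indices(20, 10, rng))
    return splits


same = repeated_cv_splits(10, 42, reseed_each_repeat=True)
varied = repeated_cv_splits(10, 42, reseed_each_repeat=False)

print(all(s == same[0] for s in same))  # True: ten identical splits
print(varied == repeated_cv_splits(10, 42, reseed_each_repeat=False))  # True: repeatable across calls
```

With `reseed_each_repeat=False` the generator state advances between repetitions, so the ten splits differ from each other, yet the whole sequence is still the same on every run because it starts from the same seed.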
Question
- Why am I getting consistent results inside the notebook, but varying results when running a standalone script?
- What is happening at the kernel restart?
Attempted Solution
In my notebook I've tried:
- Setting `os.environ['PYTHONHASHSEED'] = 'random'` inside the loop.
- Reloading `numpy` with `imp.reload(numpy)` inside the loop.
- Running `random.seed()` and `numpy.random.seed()`.
None of these make a difference.
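For reference, a small standard-library sketch of what two of these attempts do, as far as I understand the documented behaviour. `PYTHONHASHSEED` is read when the interpreter starts, so changing it through `os.environ` should not affect string hashing in the process that is already running. Calling `random.seed()` with no argument reseeds from OS entropy (or the system time), which is the opposite of fixing the seed. The string and the seed value used here are arbitrary.

```python
import os
import random

# Changing PYTHONHASHSEED in a running process does not change its hashing.
before = hash("cross-validation")
os.environ["PYTHONHASHSEED"] = "0"
after = hash("cross-validation")
print(before == after)  # True

# A fixed seed reproduces the same sequence.
random.seed(42)
a = [random.random() for _ in range(3)]
random.seed(42)
b = [random.random() for _ in range(3)]
print(a == b)  # True

# No argument: the generator is reseeded from OS entropy or the clock,
# so `c` is not expected to match `a`, here or on any later run.
random.seed()
c = [random.random() for _ in range(3)]
```

`numpy.random.seed()` without an argument behaves similarly for NumPy's global generator, so neither call pins the results to a known state.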