Thursday, July 5, 2018

Jupyter Notebook vs. script giving different results: how to change the random seed in a notebook

I have a Python script that runs 10-fold cross-validation (built on top of sklearn). Inside the script I set the random seed to 42. Each time I run the script I get different cross-validation results, so 10 x 10-fold CV gives me ten different accuracies.

I also have similar code in a Jupyter notebook, where I run the 10-fold cross-validation in a loop. On every iteration I get the same results, so 10 x 10-fold CV gives me ten identical accuracies. The results only change if I restart the notebook's kernel. After a restart I again get ten identical accuracies, but they differ from the ones I got before the restart.
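As a rough illustration of the two behaviours (not the actual sklearn-based code, which is not shown here), the sketch below uses only the standard library as a stand-in. The helper kfold_indices and the seed values are hypothetical. It only demonstrates that repeats sharing one seed get identical fold assignments, while repeats with distinct seeds generally do not.

```python
import random


def kfold_indices(n_samples, n_splits, seed):
    """Hypothetical helper: shuffle indices with a seeded RNG and cut them into folds."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    return [indices[i::n_splits] for i in range(n_splits)]


# Same seed on every repeat: all ten repeats see exactly the same folds.
fixed = [kfold_indices(100, 10, seed=42) for _ in range(10)]

# A different seed per repeat: the fold assignments change between repeats.
varied = [kfold_indices(100, 10, seed=42 + repeat) for repeat in range(10)]

print(all(folds == fixed[0] for folds in fixed))
print(all(folds == varied[0] for folds in varied))
```

If the folds are identical on every repeat and the model itself is deterministic, identical accuracies are the expected outcome. Whether that is what happens in the notebook depends on where the seed is set relative to the loop, which this sketch does not establish.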

Question

  • Why am I getting consistent results inside the notebook, but varying results when running a standalone script?
  • What is happening at the kernel restart?

Attempted Solution

In my notebook I've tried:

  1. Setting os.environ['PYTHONHASHSEED'] = 'random' inside the loop.
  2. Reloading numpy with imp.reload(numpy) inside the loop.
  3. Running random.seed(); numpy.random.seed().

None of these made a difference.
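One detail about the first attempt can be checked in isolation: PYTHONHASHSEED is read when the interpreter starts, so assigning it through os.environ does not change string hashing in a kernel or script that is already running. The sketch below is a minimal check under that assumption. The helper child_hash is hypothetical, and the check says nothing about the state of the random or numpy.random generators.

```python
import os
import subprocess
import sys


def child_hash(seed):
    """Hypothetical helper: start a fresh interpreter with PYTHONHASHSEED=seed
    and return the value of hash('abc') computed inside it."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('abc'))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


before = hash("abc")
os.environ["PYTHONHASHSEED"] = "random"  # attempt 1 from the list above
after = hash("abc")

# The running interpreter keeps the hash seed it started with.
print(before == after)

# A fixed seed passed at startup is reproducible across fresh interpreters.
print(child_hash(0) == child_hash(0))
```

For the variable to take effect, it has to be in the environment before the process starts, for example when launching the script or the notebook server.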



