Monday, February 5, 2018

How to shuffle an iterator each iteration but be deterministic across program runs

Question

I'm confused by the TensorFlow documentation for randomness as it relates to the documentation for tf.train.shuffle_batch.

Specifically, I'm looking for a way to maintain deterministic behavior across runs, so that I obtain the same results each time I run the program, while still reshuffling my iterator at the start of each epoch.

Code Snippet

I'll keep the question general as it relates to the overall documentation, but I am specifically wondering why the following doesn't work:

import tensorflow as tf  # assumed import

tf.set_random_seed(hparams.graph_seed)  # graph-level seed
...
# op-level seed on the shuffle, with reshuffling requested at each iteration
dataset = dataset.shuffle(hparams.shuffle_buffer_size, seed=hparams.shuffle_seed, reshuffle_each_iteration=True)

Based on the documentation for tf.train.shuffle_batch, this seems like it should do what I want: the graph seed and shuffle seed are both set, maintaining determinism across runs, while the iterator still reshuffles every iteration.

The problem is that it doesn't reshuffle every time the iterator is initialized. I note this possible duplicate, but then notice that TensorFlow's own NMT Tutorial does not use a placeholder for the seed as suggested in the GitHub issue (a sketch of that workaround is below), and does get shuffled results every time. Perhaps this is because they do not call tf.set_random_seed?
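
For reference, here is a minimal sketch of that placeholder workaround in TF 1.x style. The dataset, buffer size, seed values, and epoch count here are placeholders of my own, not taken from the issue:

import tensorflow as tf

# Hypothetical sketch: feed a different, but deterministically derived,
# seed each time the iterator is initialized.
seed_ph = tf.placeholder(tf.int64, shape=[])
dataset = tf.data.Dataset.range(10).shuffle(
    buffer_size=10, seed=seed_ph, reshuffle_each_iteration=True)
iterator = dataset.make_initializable_iterator()
next_element = iterator.get_next()

with tf.Session() as sess:
    for epoch in range(3):
        # Same base seed on every program run -> deterministic across runs;
        # a different seed per epoch -> a different order each epoch.
        sess.run(iterator.initializer, feed_dict={seed_ph: 1234 + epoch})
        try:
            while True:
                print(sess.run(next_element), end=' ')
        except tf.errors.OutOfRangeError:
            print()

Whether this approach interacts cleanly with tf.set_random_seed is exactly the open question here.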

Extra Questions

What is the purpose of tf.set_random_seed if it only produces deterministic behavior when used in conjunction with an op-level seed?
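
For context, here is a minimal sketch of how I understand the two seed levels to interact in TF 1.x (the specific seed values are arbitrary):

import tensorflow as tf

tf.set_random_seed(1234)             # graph-level seed
a = tf.random_uniform([1])           # no op-level seed: TF derives one from the graph seed
b = tf.random_uniform([1], seed=7)   # explicit op-level seed

with tf.Session() as sess:
    # Each op yields its own sequence within a run, but re-running the
    # program should reproduce the same values, since both seeds are fixed.
    print(sess.run(a), sess.run(b))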

Additionally, what is the purpose of the seed parameter in tf.train.shuffle_batch? Is it an op-level seed? How should these be used properly?
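
For concreteness, a minimal sketch of passing a seed to tf.train.shuffle_batch in a TF 1.x queue pipeline; the random_uniform input is a stand-in for a real reader op, and the capacity numbers are arbitrary:

import tensorflow as tf

# Hypothetical single-example producer; a real pipeline would use a reader op.
example = tf.random_uniform([4], seed=0)

# Presumably an op-level seed for the internal RandomShuffleQueue.
batch = tf.train.shuffle_batch(
    [example],
    batch_size=8,
    capacity=100,
    min_after_dequeue=50,
    seed=7,
)

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    print(sess.run(batch))  # one shuffled batch of shape [8, 4]
    coord.request_stop()
    coord.join(threads)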



