In the example below (TensorFlow 2.0), we have a dummy TensorFlow dataset with three elements. We map a function over it (replace_with_float) that returns two copies of a randomly generated value. As expected, when we take elements from the dataset, the first and second coordinates have the same value.
Now, we create two "slice" datasets from the first coordinates and the second coordinates, respectively, and zip the two datasets back together. The slicing and zipping operations seem to be inverses of each other, so I would expect the resulting dataset to be equivalent to the previous one. However, as we see, the first and second coordinates are now different randomly generated values.
Maybe even more interestingly, if we zip the "same" dataset with itself via df = tf.data.Dataset.zip((df.map(lambda x, y: x), df.map(lambda x, y: x))), the two coordinates will also have different values.
How can this behavior be explained? Perhaps two different graphs are constructed for the two datasets to be zipped and they are run independently?
import tensorflow as tf

def replace_with_float(element):
    # Draw one random scalar and return it twice.
    rand = tf.random.uniform([])
    return (rand, rand)

df = tf.data.Dataset.from_tensor_slices([0, 0, 0])
df = df.map(replace_with_float)

print('Before zipping: ')
for x in df:
    print(x[0].numpy(), x[1].numpy())

# Split into first and second coordinates, then zip them back together.
df = tf.data.Dataset.zip((df.map(lambda x, y: x), df.map(lambda x, y: y)))

print('After zipping: ')
for x in df:
    print(x[0].numpy(), x[1].numpy())
Sample output:
Before zipping:
0.08801079 0.08801079
0.638958 0.638958
0.800568 0.800568
After zipping:
0.9676769 0.23045003
0.91056764 0.6551999
0.4647777 0.6758332
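To make the hypothesis in the question concrete, here is a minimal pure-Python sketch of a lazy pipeline. It does not use TensorFlow and is not how tf.data is implemented; LazyDataset is a hypothetical toy class, written on the assumption that a dataset stores a recipe rather than values and that every iterator re-runs that recipe from the source. Under that assumption, each branch passed to zip triggers its own pass through replace_with_float, so the two coordinates come from different random draws.

```python
import random


class LazyDataset:
    """Hypothetical toy dataset: holds a recipe for producing values, not the values."""

    def __init__(self, make_iter):
        # make_iter is a zero-argument callable that returns a fresh iterator.
        self._make_iter = make_iter

    def __iter__(self):
        # Every new iterator re-runs the whole upstream recipe.
        return self._make_iter()

    def map(self, fn):
        # The child dataset re-iterates the parent each time it is iterated.
        return LazyDataset(lambda: (fn(x) for x in self))

    @staticmethod
    def zip(datasets):
        # One independent iterator per component dataset.
        return LazyDataset(lambda: zip(*(iter(d) for d in datasets)))


def replace_with_float(_):
    rand = random.random()
    return (rand, rand)


base = LazyDataset(lambda: iter([0, 0, 0])).map(replace_with_float)

# A single pass: both coordinates come from the same draw.
before = list(base)

# Two branches: each one re-runs replace_with_float on its own pass.
zipped = LazyDataset.zip((base.map(lambda p: p[0]), base.map(lambda p: p[1])))
after = list(zipped)

print('Before zipping:', before)
print('After zipping: ', after)
```

In this toy model the two map branches never share a draw, which matches the pattern in the sample output above. Whether tf.data builds separate graphs or simply re-executes one function per iterator is not something this sketch can settle.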