I am trying to estimate an RNN in TensorFlow and need to create batches of data to feed the estimation process.
I want to feed random batches, but each batch must contain uninterrupted data. That is, each batch starts at a random point in the time series but contains, say, 20 days of consecutive observations.
Below is a TensorFlow program that almost does the trick: I get random batches, but the elements within each batch are themselves drawn at random rather than being consecutive. Is it possible, with a small change to the code, to make each batch consist of uninterrupted data?
import tensorflow as tf
num_epochs = 2
# create two simple input datasets
inc_dataset = tf.data.Dataset.range(12)
dec_dataset = tf.data.Dataset.range(0, -12, -1)
# merge the two data sets
dataset = tf.data.Dataset.zip((inc_dataset, dec_dataset))
# the only "shuffler" I know in TF
dataset = dataset.shuffle(buffer_size=10000)
# batches of size 4
dataset = dataset.batch(4)
# repeat the dataset by number of epochs
dataset = dataset.repeat(num_epochs)
# one-shot iterator
sess = tf.Session()
iterator = dataset.make_one_shot_iterator()
next_element = iterator.get_next()
while True:
    try:
        print(sess.run(next_element))
    except tf.errors.OutOfRangeError:
        break
The output will be, for example:
(array([0, 3, 5, 4], dtype=int64), array([ 0, -3, -5, -4], dtype=int64))
(array([7, 8, 1, 6], dtype=int64), array([-7, -8, -1, -6], dtype=int64))
(array([ 9, 2, 11, 10], dtype=int64), array([ -9, -2, -11, -10], dtype=int64))
(array([9, 0, 5, 3], dtype=int64), array([-9, 0, -5, -3], dtype=int64))
(array([4, 8, 1, 2], dtype=int64), array([-4, -8, -1, -2], dtype=int64))
(array([10, 6, 11, 7], dtype=int64), array([-10, -6, -11, -7], dtype=int64))
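To make the target concrete, here is a plain-Python sketch of the batch structure I am after. It is only an illustration of the desired result, not a TensorFlow solution, and the names `contiguous_batches`, `window`, `num_batches` and `seed` are hypothetical ones made up for this example.

```python
import random


def contiguous_batches(series, window, num_batches, seed=None):
    """Yield `num_batches` windows of `window` consecutive items from `series`.

    Each window starts at a uniformly random position, but the items
    inside a window keep their original (uninterrupted) order.
    """
    if window > len(series):
        raise ValueError("window is longer than the series")
    rng = random.Random(seed)
    last_start = len(series) - window  # last valid start index
    for _ in range(num_batches):
        start = rng.randint(0, last_start)  # inclusive on both ends
        yield series[start:start + window]


# Same toy data as in the program above: pairs (i, -i) for i in 0..11.
series = [(i, -i) for i in range(12)]
batches = list(contiguous_batches(series, window=4, num_batches=3, seed=0))
for batch in batches:
    print(batch)
```

Every printed batch holds four consecutive pairs, and only the starting position changes from one batch to the next. In this sketch the windows may overlap, which is an assumption on my part about what is acceptable.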
Thank you very much in advance.
Br.