Wednesday, 22 August 2018

Random batches of uninterrupted data from a time series in TensorFlow

I am trying to train an RNN in TensorFlow and need to create batches of data to feed the training process.

I want to feed random batches, but each batch must contain uninterrupted data. That is, each batch starts at a random point in the time series but contains, say, 20 days of consecutive observations.

Below is a TensorFlow program that almost does the trick: I get random batches, but the elements within each batch are also in random order. Is it possible, with a small change to the code, to make each batch consist of uninterrupted data?

import tensorflow as tf

num_epochs = 2

# create two simple input datasets
inc_dataset = tf.data.Dataset.range(12)
dec_dataset = tf.data.Dataset.range(0, -12, -1)

# merge the two data sets
dataset = tf.data.Dataset.zip((inc_dataset, dec_dataset))

# the only "shuffler" I know of in TF
dataset = dataset.shuffle(buffer_size=10000)

# batches of size 4
dataset = dataset.batch(4)

# repeat the dataset by number of epochs
dataset = dataset.repeat(num_epochs)

# one-shot iterator
sess = tf.Session()
iterator = dataset.make_one_shot_iterator()
next_element = iterator.get_next()


while True:
    try:
        print(sess.run(next_element))
    except tf.errors.OutOfRangeError:
        break

Example output (the order changes from run to run):

(array([0, 3, 5, 4], dtype=int64), array([ 0, -3, -5, -4], dtype=int64))
(array([7, 8, 1, 6], dtype=int64), array([-7, -8, -1, -6], dtype=int64))
(array([ 9,  2, 11, 10], dtype=int64), array([ -9,  -2, -11, -10], dtype=int64))
(array([9, 0, 5, 3], dtype=int64), array([-9,  0, -5, -3], dtype=int64))
(array([4, 8, 1, 2], dtype=int64), array([-4, -8, -1, -2], dtype=int64))
(array([10,  6, 11,  7], dtype=int64), array([-10,  -6, -11,  -7], dtype=int64))
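To make the goal concrete, here is a minimal sketch of the behaviour I am after: cut the series into sliding windows first and shuffle the windows, so that only the starting point is random. It assumes TensorFlow 2.x with eager execution (the program above uses the 1.x Session API) and that `Dataset.window` is available in the installed version; `window_size` and `batches` are names I made up for illustration.

```python
import tensorflow as tf

window_size = 4  # hypothetical name: length of each uninterrupted batch

# Same toy data as in the program above.
inc_dataset = tf.data.Dataset.range(12)
dec_dataset = tf.data.Dataset.range(0, -12, -1)
dataset = tf.data.Dataset.zip((inc_dataset, dec_dataset))

# Slide a window over the series one step at a time.
# Each window is a pair of small datasets, one per input series.
windows = dataset.window(window_size, shift=1, drop_remainder=True)

# Collapse each window into one pair of tensors of length window_size.
windows = windows.flat_map(
    lambda inc, dec: tf.data.Dataset.zip(
        (inc.batch(window_size), dec.batch(window_size))
    )
)

# Shuffle whole windows: the start is random, the contents stay in order.
windows = windows.shuffle(buffer_size=10000)

# Collect the results as plain Python lists for inspection.
batches = [(inc.numpy().tolist(), dec.numpy().tolist()) for inc, dec in windows]
for inc, dec in batches:
    print(inc, dec)
```

With `shift=1` the windows overlap, so 12 elements give 9 windows of length 4. Setting `shift=window_size` would give non-overlapping windows instead, and `repeat(num_epochs)` could be applied after the shuffle as in the original program.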

Thank you very much in advance.

Best regards.