Thursday, March 30, 2017

TensorFlow: running the same computational graph with different random samples efficiently

In TensorFlow, the computational graph can be made to depend on random variables. When such a variable represents a single sample from a distribution, it can be of interest to evaluate the quantity with N separate samples, e.g. to form a sample estimate with lower variance.

Is there a way to run the same graph with different random samples, reusing as many of the intermediate calculations as possible?

Possible solutions:

  • Create the graph and the random variable inside a loop. Con: this makes redundant copies of quantities that don't depend on the random variable.
  • Extend the random variable with a batch dimension (see the sketch after this list). Con: a bit cumbersome; it seems like something TF should be able to do automatically.
  • Maybe the graph editor (in tf.contrib) can be used to make copies with different noise, but I'm unsure whether this is any better than the looping.
  • Ideally, there should just be an op that re-evaluates the random variable or marks the dependency as unfulfilled, forcing it to draw a new sample. But this might very well not be possible.
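
As an illustration of the batch-dimension option, here is a minimal sketch using the same toy computation as the looping example below; f is just a stand-in for the noise-independent part, and the key assumption is that every op on the noise path broadcasts correctly against the extra dimension:

import tensorflow as tf

N = 10                        # number of samples
x = tf.Variable(1.)
f = lambda x: tf.identity(x)  # stand-in for an op that does not depend on the noise

a = tf.random_normal((N,))    # all N samples drawn as one batched tensor
y = tf.pow(f(x) + a * x, 2)   # f(x) is built once and broadcast against the batch
m = tf.reduce_mean(y)         # sample estimate of the mean

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(m))

In a real graph this only works if every op between the noise and the final quantity tolerates the extra leading dimension, which is exactly what makes the approach cumbersome.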

Example of the looping strategy:

import tensorflow as tf

x = tf.Variable(1.)
g = []
f = lambda x: tf.identity(x)  # op that does not depend on the noise

for i in range(10):
    a = tf.random_normal(())     # random variable: one scalar sample per copy
    y = tf.pow(f(x) + a * x, 2)  # quantity to be repeated for different samples
    g += [y]

# here, the mean is the quantity of interest
m = tf.reduce_mean(g)

# the variance demonstrates that the samples are different
v = tf.reduce_mean(tf.map_fn(lambda x: tf.square(x - m), tf.stack(g)))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(g))
    print('variance: {}'.format(sess.run(v)))

If f(x) is an expensive function, I can only assume that the loop makes a lot of redundant calculations, since each iteration builds its own copy of f(x).
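
Within the looping strategy itself, a partial mitigation (sketched below under the same stand-in assumptions as above) is to build the noise-independent part once, outside the loop, and reuse the resulting tensor, so that only the noise-dependent ops are duplicated:

import tensorflow as tf

x = tf.Variable(1.)
f = lambda x: tf.identity(x)  # stand-in for the expensive, noise-independent op
fx = f(x)                     # built, and therefore evaluated, only once

g = []
for i in range(10):
    a = tf.random_normal(())       # fresh scalar sample per copy
    g += [tf.pow(fx + a * x, 2)]   # only the cheap, noise-dependent ops are duplicated

m = tf.reduce_mean(g)

This doesn't help when the expensive part sits between the noise and the output, but it covers the common case where the costly computation does not depend on the sample at all.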



