Thursday, 23 April 2020

Add random Gaussian noise per batch using TensorFlow

I am training on text data using BERT. Because the training set is small, I want to simulate additional data (augmentation, for example) by adding random Gaussian noise to the sentence vectors in each batch.

Below is the code.

    # Add positional embeddings and token type embeddings, then layer
    # normalize and perform dropout.
    self.embedding_output = embedding_postprocessor(
        input_tensor=self.word_embedding_output,
        use_token_type=True,
        token_type_ids=token_type_ids,
        token_type_vocab_size=config.type_vocab_size,
        token_type_embedding_name="token_type_embeddings",
        use_position_embeddings=True,
        position_embedding_name="position_embeddings",
        initializer_range=config.initializer_range,
        max_position_embeddings=config.max_position_embeddings,
        dropout_prob=config.hidden_dropout_prob,
        use_one_hot_embeddings=use_one_hot_embeddings)

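    # Perturb the post-processed embeddings with noise.
    self.embedding_output = add_gaussian_noise(self.embedding_output, std=0.5)

    # My function for adding noise (defined at module level, outside the model class).
    # Note: truncated_normal re-draws any sample that falls more than two
    # standard deviations from the mean, so the noise is a truncated Gaussian.
    def add_gaussian_noise(input_tensor, std):
        noise = tf.random.truncated_normal(
            shape=tf.shape(input_tensor),
            mean=0.0, stddev=std, dtype=input_tensor.dtype)
        return input_tensor + noise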

But it seems that the same random Gaussian noise is added to every batch.

Is it possible to add different random noise to every batch in TensorFlow?

Thanks.
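For reference, here is a minimal standalone sketch for checking whether the noise is resampled from one call to the next. It assumes TensorFlow 2.x, which is not what the BERT code above was written for. The `training` flag, the tensor shape, and the use of `tf.random.normal` in place of `tf.random.truncated_normal` are illustrative assumptions, not part of my model code.

```python
import tensorflow as tf


def add_gaussian_noise(input_tensor, std, training=True):
    """Add zero-mean Gaussian noise; return the input unchanged when not training."""
    if not training:
        return input_tensor
    noise = tf.random.normal(
        shape=tf.shape(input_tensor),
        mean=0.0, stddev=std, dtype=input_tensor.dtype)
    return input_tensor + noise


# Hypothetical stand-in for one batch of embeddings: [batch, seq_len, hidden].
batch = tf.zeros([2, 4, 8])

# Eager mode: each call draws a new sample.
noisy_a = add_gaussian_noise(batch, std=0.5)
noisy_b = add_gaussian_noise(batch, std=0.5)
eager_differs = bool(tf.reduce_any(tf.not_equal(noisy_a, noisy_b)))

# Graph mode: the random op is part of the traced graph and is
# re-executed on every call, so the noise is also resampled here.
graph_fn = tf.function(add_gaussian_noise)
graph_a = graph_fn(batch, 0.5)
graph_b = graph_fn(batch, 0.5)
graph_differs = bool(tf.reduce_any(tf.not_equal(graph_a, graph_b)))

print("eager noise differs between calls:", eager_differs)
print("graph noise differs between calls:", graph_differs)
```

If both flags come out true in a test like this but the model still seems to receive identical noise, the cause is probably somewhere other than the noise function itself, for example in how the noisy tensor is inspected or in a fixed seed set elsewhere.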



