I am training BERT on text data. Because my training set is small, I want to do some data simulation (augmentation, etc.): I want to add random Gaussian noise to the sentence vectors for each batch.
Below is the code.
# Add positional embeddings and token type embeddings, then layer
# normalize and perform dropout.
self.embedding_output = embedding_postprocessor(
    input_tensor=self.word_embedding_output,
    use_token_type=True,
    token_type_ids=token_type_ids,
    token_type_vocab_size=config.type_vocab_size,
    token_type_embedding_name="token_type_embeddings",
    use_position_embeddings=True,
    position_embedding_name="position_embeddings",
    initializer_range=config.initializer_range,
    max_position_embeddings=config.max_position_embeddings,
    dropout_prob=config.hidden_dropout_prob,
    use_one_hot_embeddings=use_one_hot_embeddings)
self.embedding_output = add_gaussian_noise(self.embedding_output, std=0.5)
# My function for adding noise
def add_gaussian_noise(input_tensor, std):
    # Sample zero-mean truncated-normal noise with the same shape as the input.
    noise = tf.random.truncated_normal(shape=tf.shape(input_tensor),
                                       mean=0.0, stddev=std, dtype=tf.float32)
    return input_tensor + noise
However, it seems that the same random Gaussian noise is added for every batch.
Is it possible to add different random noise for every batch in TensorFlow?
Thanks.
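To make the goal concrete, here is a minimal sketch of the behaviour I am after, written in plain NumPy rather than TensorFlow so that it runs on its own. It is not the BERT code: `add_gaussian_noise_np`, the array shape, and the seed are hypothetical stand-ins, and it uses an ordinary (not truncated) normal distribution. The point is only that the noise is sampled again on every call, so two batches never receive the same noise.

```python
import numpy as np


def add_gaussian_noise_np(batch, std, rng):
    """Return `batch` plus freshly sampled zero-mean Gaussian noise."""
    # A new sample is drawn on every call, so each batch gets different noise.
    noise = rng.normal(loc=0.0, scale=std, size=batch.shape)
    return batch + noise


# One generator shared by all batches (the seed is only for reproducibility).
rng = np.random.default_rng(0)

# Hypothetical stand-in for an embedding batch: (batch_size, seq_len, hidden).
batch = np.zeros((2, 4, 8), dtype=np.float32)

noisy_first = add_gaussian_noise_np(batch, std=0.5, rng=rng)
noisy_second = add_gaussian_noise_np(batch, std=0.5, rng=rng)

# The input is identical, so any difference comes from the noise alone.
noise_differs = not np.allclose(noisy_first, noisy_second)
print(noise_differs)
```

The same comparison (apply the noise twice to the same input and check that the results differ) could be used to verify whether the TensorFlow version really repeats its noise.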