I have a model that should train on 25,000 samples for 50,000 epochs. I want to train on a fraction of the dataset for a fraction of the epochs: for example, for the first 10 epochs only 1,000 random samples, then for the next 10 epochs another 1,000 random samples, and so on. The relevant part of my dataloader code is below.
import pytorch_lightning as pl
from torch.utils.data import DataLoader

class DataModule(pl.LightningDataModule):
    def __init__(self, train_dataset, val_dataset, batch_size=2):
        super(DataModule, self).__init__()
        self.train_dataset = train_dataset
        self.val_dataset = val_dataset
        self.batch_size = batch_size

    def train_dataloader(self):
        return DataLoader(self.train_dataset, batch_size=self.batch_size,
                          collate_fn=collate_fn, shuffle=True, num_workers=2, pin_memory=True)

    def val_dataloader(self):
        return DataLoader(self.val_dataset, batch_size=self.batch_size,
                          collate_fn=collate_fn, shuffle=False, num_workers=2, pin_memory=True)
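To make the goal concrete, here is a rough sketch of what I imagine: train_dataloader() draws a fresh 1,000-sample subset every time it is called, and the Trainer is told to rebuild the dataloaders every 10 epochs. I am assuming that the reload_dataloaders_every_n_epochs argument of pl.Trainer really does re-call train_dataloader() like this, and ResamplingDataModule is just a name I made up; I have not verified any of it.

import torch
import pytorch_lightning as pl
from torch.utils.data import DataLoader, Subset

class ResamplingDataModule(DataModule):
    def train_dataloader(self):
        # Draw 1,000 fresh random indices each time Lightning calls this hook.
        indices = torch.randperm(len(self.train_dataset))[:1000].tolist()
        subset = Subset(self.train_dataset, indices)
        return DataLoader(subset, batch_size=self.batch_size, collate_fn=collate_fn,
                          shuffle=True, num_workers=2, pin_memory=True)

# Assumption: reload_dataloaders_every_n_epochs makes the Trainer call
# train_dataloader() again every 10 epochs, so a new subset is drawn each time.
trainer = pl.Trainer(max_epochs=50000, reload_dataloaders_every_n_epochs=10)
trainer.fit(model, datamodule=ResamplingDataModule(train_dataset, val_dataset))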
I understand that the code below selects a random fraction of the dataset, but I also want the remaining data to be used in later epochs.
df_fraction = df_mydataset.sample(frac=0.04)
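What I think I need instead is something that cycles through disjoint chunks, so every sample is eventually seen. A rough sketch of that idea (chunk_for_epoch is just a name I made up):

import numpy as np

# Shuffle all row indices once, then cut them into disjoint chunks of ~1,000.
perm = np.random.permutation(len(df_mydataset))
chunks = [perm[i:i + 1000] for i in range(0, len(perm), 1000)]

def chunk_for_epoch(epoch, epochs_per_chunk=10):
    # Every 10 epochs move on to the next chunk, wrapping around at the end.
    return df_mydataset.iloc[chunks[(epoch // epochs_per_chunk) % len(chunks)]]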
I also understand that the code below selects a random subset of the dataset, but I don't know how to make it work here, because I need to change the subset every 10 epochs.
train_sampler = SubsetRandomSampler(train_indices)
train_loader = torch.utils.data.DataLoader(dataset, batch_size=2, sampler=train_sampler)
How can I do that with batch_size=2?
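For reference, here is a rough sketch of how I imagine swapping the sampler in a plain PyTorch loop (dataset and train_one_epoch are placeholders), although I would prefer to keep the Lightning DataModule:

import torch
from torch.utils.data import DataLoader, SubsetRandomSampler

for epoch in range(50000):
    if epoch % 10 == 0:
        # Draw 1,000 fresh random indices at the start of each 10-epoch block.
        train_indices = torch.randperm(len(dataset))[:1000].tolist()
        train_sampler = SubsetRandomSampler(train_indices)
        train_loader = DataLoader(dataset, batch_size=2, sampler=train_sampler,
                                  collate_fn=collate_fn, num_workers=2, pin_memory=True)
    train_one_epoch(train_loader)  # placeholder for the real training step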