Monday, October 18, 2021

The difference between "Subset" and "SubsetRandomSampler"

Recently, I have been trying to implement K-fold cross-validation using the "Subset" and "SubsetRandomSampler" methods.

When I use the "Subset" method, the first-epoch accuracy on the CIFAR10 dataset is 88%. However, when I use the "SubsetRandomSampler" method, the first-epoch accuracy on CIFAR10 is only 16%. This really confuses me and I have no idea why. Does anybody know what causes this? Thanks a lot.

The code for the "Subset" method:

for fold, (trainLoader, valLoader) in enumerate(kf.split(trainSet)):
    # trainLoader / valLoader hold this fold's sample indices (they are passed to Subset as indices)
    trainSetBasic = torch.utils.data.Subset(trainSet, trainLoader)
    valSetBasic = torch.utils.data.Subset(trainSet, valLoader)

    # Each DataLoader wraps a dataset that contains only this fold's samples
    dataloaders = {
        'train': DataLoader(trainSetBasic, batch_size=BATCH_SIZE, shuffle=True, num_workers=2),
        'val': DataLoader(valSetBasic, batch_size=BATCH_SIZE, shuffle=True, num_workers=2)
    }

The code for the "SubsetRandomSampler" method:

for fold, (trainLoader, valLoader) in enumerate(kf.split(trainSet)):
    # trainLoader / valLoader hold this fold's sample indices (they are passed to the samplers as indices)
    train_subsampler = torch.utils.data.SubsetRandomSampler(trainLoader)
    val_subsampler = torch.utils.data.SubsetRandomSampler(valLoader)

    # Each DataLoader wraps the full trainSet; the sampler restricts which indices are drawn
    dataloaders = {
        'train': DataLoader(trainSet, batch_size=BATCH_SIZE, sampler=train_subsampler, shuffle=False, num_workers=2),
        'val': DataLoader(trainSet, batch_size=BATCH_SIZE, sampler=val_subsampler, shuffle=False, num_workers=2)
    }

The rest of the code is identical in both cases.
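One difference between the two setups that may be worth checking is what `len(loader.dataset)` returns. The sketch below uses a toy `TensorDataset` and a hand-written index list as stand-ins for `trainSet` and one fold's validation indices (both are assumptions, not the code from the question). It only illustrates that the two loaders yield the same number of samples per epoch while reporting different dataset lengths. Whether this explains the 88% vs. 16% gap depends on how accuracy is computed in the part of the code that is not shown: if the number of correct predictions is divided by `len(dataloader.dataset)`, the sampler version would be divided by the full dataset size.

```python
import torch
from torch.utils.data import DataLoader, Subset, SubsetRandomSampler, TensorDataset

# Toy stand-in for trainSet: 100 samples, 3 features, 10 classes (hypothetical data).
full_set = TensorDataset(torch.randn(100, 3), torch.randint(0, 10, (100,)))

# Hypothetical validation indices for one fold (20 of the 100 samples).
val_idx = list(range(80, 100))

# Approach 1: wrap the chosen indices in a Subset.
subset_loader = DataLoader(Subset(full_set, val_idx), batch_size=10, shuffle=True)

# Approach 2: keep the full dataset and restrict sampling with SubsetRandomSampler.
sampler_loader = DataLoader(full_set, batch_size=10, sampler=SubsetRandomSampler(val_idx))

# Count how many samples each loader actually yields in one epoch.
n_subset = sum(len(labels) for _, labels in subset_loader)
n_sampler = sum(len(labels) for _, labels in sampler_loader)

# Lengths reported by the objects attached to each loader.
len_subset_dataset = len(subset_loader.dataset)    # length of the Subset
len_sampler_dataset = len(sampler_loader.dataset)  # length of the full dataset
len_sampler = len(sampler_loader.sampler)          # number of indices in the sampler

print(n_subset, n_sampler)
print(len_subset_dataset, len_sampler_dataset, len_sampler)
```

If the accuracy code does divide by `len(dataloader.dataset)`, dividing by `len(dataloader.sampler)` instead (or counting the samples seen while iterating) would make the two approaches comparable.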



