I have a dataset covering around 50 contiguous days. I want to split it into training and test sets so that, within each week, 5 days go to the training set and 2 days go to the test set.
The 2 test days should be chosen at random from the 7 days of each week, not always the same ones (e.g. not always the first 2 days).
How can I do that?
Is there a function for it in R? This is how I am currently dividing the data into training and test sets, but the test and training observations probably end up very close to each other in time, so I always get a very high MSR value.
dataset1
set.seed(100)
# Draw 70% of the row indices at random, without replacement, for training
train <- sample(nrow(dataset1), floor(0.7 * nrow(dataset1)), replace = FALSE)
TrainSet <- dataset1[train,]
#scale (TrainSet, center = TRUE, scale = TRUE)
ValidSet <- dataset1[-train,]
#scale (ValidSet, center = TRUE, scale = TRUE)
summary(TrainSet)
summary(ValidSet)
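
For comparison, the weekly split described above could be sketched roughly like this in base R. It is only a sketch under assumptions: it supposes dataset1 has a Date column called date (a hypothetical name, as is the stand-in data created here), and it treats a "week" as a block of 7 consecutive days counted from the first date rather than a calendar week. A trailing partial week always keeps at least one day in the training set.

```r
set.seed(100)

# Hypothetical stand-in for dataset1: one row per day, 50 contiguous days.
# The column names `date` and `value` are assumptions for illustration.
dataset1 <- data.frame(
  date  = seq(as.Date("2021-01-04"), by = "day", length.out = 50),
  value = rnorm(50)
)

# Number each day from 0 and group days into blocks of 7 ("weeks").
day_index <- as.integer(dataset1$date - min(dataset1$date))
week_id   <- day_index %/% 7

# For each week, pick 2 of its days at random (fewer in a partial week,
# so that at least one day of that week stays in the training set).
days <- sort(unique(day_index))
test_days <- unlist(lapply(split(days, days %/% 7), function(d) {
  d[sample.int(length(d), min(2, length(d) - 1))]
}))

# All rows belonging to a selected day go to the test set.
is_test  <- day_index %in% test_days
TrainSet <- dataset1[!is_test, ]
ValidSet <- dataset1[is_test, ]

# Check: number of test rows per week
table(week = week_id, test = is_test)
```

Because whole days are assigned to one set or the other, this also works when there are several rows per day. Note that test days can still be adjacent to training days, so it reduces but does not remove the closeness in time between the two sets.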