I have two different data frames with the same data structure:
1. df1 with response 'Yes' (US states as columns)
2. df2 with response 'No' (US states as columns)
I want to collect samples from both data frames and combine them into one sample data frame of a specified size. I also want to keep the sample data set balanced. For example, if I take a sample from df1 and get 50 observations from NY state, then I want 50 random NY observations from df2 as well.
I have written a function that takes samples from both data frames and shuffles them, but I am unable to incorporate the second requirement (matching the per-state counts):
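library(dplyr)  # provides bind_rows()

# Draw size/2 rows from each data frame (with replacement), combine, and shuffle.
sample12 <- function(df1, df2, size) {
  a <- df1[sample(nrow(df1), size / 2, replace = TRUE), ]
  b <- df2[sample(nrow(df2), size / 2, replace = TRUE), ]
  s1 <- bind_rows(a, b)
  s2 <- s1[sample(nrow(s1)), ]           # shuffle the combined rows
  assign("s1", s2, envir = .GlobalEnv)   # store the shuffled sample as global s1
}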
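One way the second requirement might be handled is sketched below. It is only a sketch under assumptions: it supposes that both data frames carry the state in a single column, here given the hypothetical name `state` (the post does not say how the states are stored), and the function name `sample_balanced` is made up for illustration. The idea is to sample from df1 first, count how many sampled rows fall in each state, and then draw that same number of rows per state from df2. Sampling is done with replacement, as in the function above.

```r
# Sketch only. Assumes a column holding the state; "state" is a hypothetical name.
sample_balanced <- function(df1, df2, size, state_col = "state") {
  half <- size %/% 2

  # Step 1: sample half of the rows from df1 (with replacement)
  a <- df1[sample(nrow(df1), half, replace = TRUE), , drop = FALSE]

  # Step 2: count the sampled rows per state
  counts <- table(as.character(a[[state_col]]))

  # Step 3: for each state, draw the same number of rows from df2
  idx <- unlist(lapply(seq_along(counts), function(i) {
    st <- names(counts)[i]
    n <- as.integer(counts[i])
    rows <- which(df2[[state_col]] == st)
    if (length(rows) == 0) stop("df2 has no rows for state: ", st)
    rows[sample.int(length(rows), n, replace = TRUE)]
  }))
  b <- df2[idx, , drop = FALSE]

  # Step 4: combine and shuffle, returning the result instead of assigning globally
  out <- rbind(a, b)
  out[sample(nrow(out)), , drop = FALSE]
}
```

It would be called as `s <- sample_balanced(df1, df2, 100)`. Returning the data frame rather than writing to the global environment is a design choice of this sketch, not something the original function does.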