Tuesday, 18 July 2023

Sampling rows randomly according to per-group rules in R

I have dat1 and dat2 like this:

dat1<-data.frame(company=c("a","b","c"),
                 random=c(2,1,1),
                 prior=c(1,3,1))

dat2<-data.frame(company=c("a","a",'a',"b","b","a","a","c","c","c",
                           "b","b","b"),
                 peter=c(5,0,1,1,1,0,100,7,8,7,0,0,1),
                 turner=c(5,1,2,1,2,0,200,7,9,6,0,0,0),
                 austin=c(7,5,3,1,0,0,300,77,10,5,0,1,1),
                 label=c("random","random","prior","random","random","random","prior",
                         "random","random","prior","prior","prior","prior"))

dat2 is the original data, and dat1 gives the guideline for how many rows to sample from dat2. For example, dat1 has random=2 and prior=1 for company==a. This means that, among the rows of dat2 with company==a, you pick 2 rows at random from those with label==random and 1 row from those with label==prior. One possible sampled result looks like this:

data<-data.frame(company=c("a","a","a","b","b","b","b","c","c"),
                 peter=c(5,0,100,1,0,0,1,7,7),
                 turner=c(5,1,200,1,0,0,0,7,6),
                 austin=c(7,5,300,1,0,1,1,77,5),
                 label=c("random","random","prior","random","prior","prior","prior","random","prior"))

So my question is: how can I get data from dat1 and dat2?
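
For reference, here is a minimal base R sketch of one way this sampling might be done. It is not the only approach. The names quota and picked are hypothetical helpers introduced for illustration, the seed is set only to make the draw reproducible, and capping the draw at the number of available rows is an assumption about what should happen when dat1 asks for more rows than dat2 has.

```r
# dat1 and dat2 exactly as defined in the question, repeated so this runs on its own.
dat1 <- data.frame(company = c("a", "b", "c"), random = c(2, 1, 1), prior = c(1, 3, 1))
dat2 <- data.frame(
  company = c("a", "a", "a", "b", "b", "a", "a", "c", "c", "c", "b", "b", "b"),
  peter   = c(5, 0, 1, 1, 1, 0, 100, 7, 8, 7, 0, 0, 1),
  turner  = c(5, 1, 2, 1, 2, 0, 200, 7, 9, 6, 0, 0, 0),
  austin  = c(7, 5, 3, 1, 0, 0, 300, 77, 10, 5, 0, 1, 1),
  label   = c("random", "random", "prior", "random", "random", "random", "prior",
              "random", "random", "prior", "prior", "prior", "prior")
)

set.seed(42)  # assumption: fixed seed, only for reproducibility

# Hypothetical helper table: one row per (company, label) with the number to draw.
quota <- data.frame(
  company = rep(dat1$company, times = 2),
  label   = rep(c("random", "prior"), each = nrow(dat1)),
  n       = c(dat1$random, dat1$prior)
)

# For each (company, label) pair, draw n row indices of dat2 without replacement.
picked <- unlist(lapply(seq_len(nrow(quota)), function(i) {
  idx <- which(dat2$company == quota$company[i] & dat2$label == quota$label[i])
  k <- min(quota$n[i], length(idx))  # assumption: never ask for more rows than exist
  if (k == 0) return(integer(0))
  idx[sample.int(length(idx), size = k)]  # index into idx to avoid sample()'s length-1 quirk
}))

# Order by company, then by original row position, and extract the rows.
picked <- picked[order(dat2$company[picked], picked)]
data <- dat2[picked, ]
rownames(data) <- NULL
```

Indexing with sample.int(length(idx), ...) rather than calling sample(idx, ...) matters here: when idx has a single element, sample(idx, 1) would sample from 1:idx instead of returning that element. Running table(data$company, data$label) afterwards is a quick way to check that the counts match dat1.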



