I am trying to randomly sample 50% of the data for each of the group following Stratified random sampling from data frame. A reproducible example using mtcars dataset in R looks like below. What I dont understand is, the sample index clearly shows a group of gear labeled as '5', but when the index is applied to the mtcars dataset, the sampled data mtcars2 does not contain any record from gear='5'. What went wrong? Thank you very much.
> set.seed(14908141)
> index=tapply(1:nrow(mtcars),mtcars$gear,function(x){sample(length(x),length(x)*0.5)})
> index
$`3`
[1] 6 7 14 4 12 9 13
$`4`
[1] 12 7 8 4 6 5
$`5`
[1] 5 1
> mtcars2=mtcars[unlist(index),]
> table(mtcars2$gear)
3 4
12 3
Aucun commentaire:
Enregistrer un commentaire