I'm trying to draw a random sample of rows, without replacement, from a dataset such that the sum of one column in the sample falls strictly within a given range. For the example dataset mtcars, the sum of mpg in the sample should be strictly between 90 and 100.
A reproducible example:
library(dplyr)
data("mtcars")

random_sample <- function(dataset) {
  final_mpg <- 0
  while (final_mpg < 100) {
    # Draw one row at random
    basic_dat <- dataset %>%
      sample_n(1) %>%
      ungroup()
    # mpg of the row just drawn
    total_mpg <- basic_dat %>%
      summarise(mpg = sum(mpg)) %>%
      pull(mpg)
    # Add it to the running total
    final_mpg <- final_mpg + total_mpg
    if (final_mpg > 90 & final_mpg < 100) {
      break
    }
    final_dat <- rbind(get0("final_dat"), get0("basic_dat"))
  }
  return(final_dat)
}

chosen_sample <- random_sample(mtcars)
But this function outputs samples with sum(mpg) > 100. How do I ensure that every sample it generates has a sum strictly within that range? Any help is much appreciated.
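For what it's worth, here is one possible direction, offered as a rough sketch rather than a definitive answer. In the loop above, the running total is updated before the range check, so a single draw can carry it from below 90 straight past 100. The row that triggers the break is also never bound into final_dat, and each call to sample_n(1) draws independently, so the same row can appear twice. The sketch below uses plain rejection sampling instead: shuffle the rows once (so any prefix is a sample without replacement), take the running sum, and keep the first prefix whose sum lies strictly inside the range, reshuffling if the total skips over it. The function name sample_within_range and its arguments column, lower, upper and max_tries are hypothetical, introduced here only for illustration. It uses base R only, and it does not draw uniformly from all valid subsets.

```r
# Hypothetical helper (not from any package): rejection-sampling sketch.
sample_within_range <- function(dataset, column = "mpg",
                                lower = 90, upper = 100,
                                max_tries = 1000) {
  for (attempt in seq_len(max_tries)) {
    # Shuffle every row once; a prefix of the shuffle is a sample
    # without replacement.
    shuffled <- dataset[sample(nrow(dataset)), , drop = FALSE]
    running_total <- cumsum(shuffled[[column]])

    # Prefix lengths whose running total is strictly inside (lower, upper).
    hits <- which(running_total > lower & running_total < upper)
    if (length(hits) > 0) {
      return(shuffled[seq_len(hits[1]), , drop = FALSE])
    }
    # Otherwise the total jumped over the range: reshuffle and try again.
  }
  stop("No sample found within the range after ", max_tries, " tries")
}

chosen_sample <- sample_within_range(mtcars)
sum(chosen_sample$mpg)  # strictly between 90 and 100
```

The max_tries cap is an assumption added so the function fails loudly instead of looping forever when the range cannot be reached, for example if it is narrower than the gaps between achievable sums.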