Thursday, December 15, 2022

How to draw random samples according to category?

I have a data frame called 'listings_df'. I need to impute the NA values in the variables 'number_of_reviews' and 'review_scores_rating' by drawing random samples according to the values in each 'room_type' group.

I attach a picture of what the data frame looks like:

listing_df.png

First, I tried grouping by 'room_type':

test <- listings_df %>% group_by(room_type)

Then I select the columns whose NA values I want to replace and create the samples:

test$number_of_reviews[is.na(listings_df$number_of_reviews)] <- 
  sample(listings_df$number_of_reviews, size = sum(is.na(listings_df$number_of_reviews)))

test$review_scores_rating[is.na(listings_df$review_scores_rating)] <- 
  sample(listings_df$review_scores_rating, size = sum(is.na(listings_df$review_scores_rating)))

I am not sure whether this creates the random data according to 'room_type'. I would also like to know whether it is possible to do this with a loop.

Thanks!
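
A note on the attempt above: group_by() only marks the groups, and the `$<-` assignments that follow ignore that marking, so the replacement values are drawn from the whole column (including its NA entries) rather than from each 'room_type'. A minimal sketch of per-group sampling with a loop over the two columns could look like the following. It uses base R only; the helper name fill_na_by_sampling and the toy data are assumptions made for illustration, and drawing with replacement is a choice of this sketch, not something stated in the question.

```r
# Hypothetical helper: replace the NAs in x with values drawn at random
# (with replacement) from the non-missing values of x.
fill_na_by_sampling <- function(x) {
  missing <- is.na(x)
  observed <- x[!missing]
  # Nothing to fill, or nothing to draw from: return x unchanged.
  if (!any(missing) || length(observed) == 0) {
    return(x)
  }
  # Sample positions rather than values, so that a single observed
  # number n is not interpreted by sample() as the range 1:n.
  picks <- sample.int(length(observed), sum(missing), replace = TRUE)
  x[missing] <- observed[picks]
  x
}

# Toy data standing in for listings_df (values are made up).
listings_df <- data.frame(
  room_type = c("Entire home/apt", "Entire home/apt", "Entire home/apt",
                "Private room", "Private room", "Private room"),
  number_of_reviews = c(10, NA, 30, 1, 2, NA),
  review_scores_rating = c(4.5, 4.8, NA, NA, 3.9, 4.1)
)

set.seed(42)  # only to make the example reproducible

test <- listings_df
for (col in c("number_of_reviews", "review_scores_rating")) {
  # ave() applies the helper separately within each room_type group.
  test[[col]] <- ave(test[[col]], test$room_type, FUN = fill_na_by_sampling)
}
```

If dplyr is already loaded, the same helper should also work inside a grouped mutate(), for example listings_df %>% group_by(room_type) %>% mutate(across(c(number_of_reviews, review_scores_rating), fill_na_by_sampling)), which avoids the explicit loop. A group with no observed values at all keeps its NAs in either version.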



