I know how to take a random sample from each group of a data frame using sample_n or sample_frac in dplyr, like this:
dataset %>%
  group_by(user_id) %>%
  sample_n(10)
However, I have a slightly different question. I want to take a random sample from the whole dataset. It should be as simple as this one,
sample_n(dataset,10)
However, because I applied group_by to the dataset earlier, the grouping still seems to be in effect: the second command behaves the same as the first.
How can I remove the effect of group_by and take a random sample from the whole dataset?
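One possible way to illustrate the behavior described above, as a minimal sketch: it assumes the grouped result was stored in an object (here the hypothetical `grouped`), and it uses a made-up `dataset` with a hypothetical `value` column. dplyr's `ungroup()` drops the grouping, so `sample_n()` then draws from all rows.

```r
library(dplyr)

set.seed(1)  # only so the sketch is reproducible

# Hypothetical data: 3 users with 20 rows each
dataset <- data.frame(
  user_id = rep(1:3, each = 20),
  value   = seq_len(60)
)

# Storing the grouped result keeps the grouping attached to the object
grouped <- dataset %>% group_by(user_id)

# On a grouped data frame, sample_n() draws 10 rows per group
per_group <- sample_n(grouped, 10)

# ungroup() removes the grouping, so sample_n() draws 10 rows overall
overall <- grouped %>%
  ungroup() %>%
  sample_n(10)

nrow(per_group)  # 30 (10 rows for each of the 3 users)
nrow(overall)    # 10
```

If the grouping is not needed afterwards, the ungrouped result could also be assigned back to the original name, so that later calls work on the whole dataset.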