I would like to know if there is an efficient way to sample by group, choosing a different integer count and/or proportion of rows for each group. I am aware of sample_n and that it works with grouped data frames, but as far as I know it samples the same number of rows from every group.
A minimal version of the problem: from the dataframe mtg, sample 5 random rows (or a vector of the indexes of those rows) where cyl == 4, 7 where cyl == 6, and 3 where cyl == 8.
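To make the desired result concrete, here is a rough base R sketch of the kind of output I am after. It is only an illustration under assumptions: the built-in mtcars dataset stands in for mtg, and n_per_group is a hypothetical named vector I made up to hold the per-group sizes.

```r
# Assumption: mtg has a cyl column; mtcars is used as a stand-in here.
mtg <- mtcars

# Hypothetical lookup: group value (as a name) -> number of rows to sample
n_per_group <- c("4" = 5, "6" = 7, "8" = 3)

set.seed(1)  # only so the example is reproducible

# Split the row indexes by group, then draw the requested number from each.
# Indexing rows with sample.int() avoids the sample() pitfall where a
# length-one vector is treated as 1:x.
idx_by_group <- split(seq_len(nrow(mtg)), mtg$cyl)
sampled_idx <- unlist(lapply(names(n_per_group), function(g) {
  rows <- idx_by_group[[g]]
  rows[sample.int(length(rows), n_per_group[[g]])]
}))

# Either keep the index vector or subset the data frame with it
sampled <- mtg[sampled_idx, ]
table(sampled$cyl)
```

For proportions, the size could presumably be computed per group instead, for example ceiling(p * length(rows)) with p a per-group fraction. This works, but I am hoping for something more efficient or more idiomatic.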
Thanks,