Wednesday, 30 June 2021

Take random samples based on two groups in R

I have data with two columns, ID and specialty. Each ID occurs twice, always with the same specialty. I would like to take a random sample of 400 rows from each specialty group, keeping both occurrences of every sampled ID. I have tried dplyr's group_by with sample_n, but it returns IDs with differing numbers of occurrences (some IDs appear only once).

Example data:

specialty <- c("obs", "obs", "ped", "ped", "im", "im")
ID <- c("M", "M", "K", "K", "l", "l")

My desired output, if I sampled 2 rows per specialty, would be:

specialty <- c("obs", "obs", "im", "im")
ID <- c("M", "M", "l", "l")

What I get is

specialty <- c("obs", "obs", "im", "im")
ID <- c("M", "M", "l", "K")
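
One possible way to keep both rows of every sampled ID is to sample at the ID level rather than the row level: draw IDs within each specialty, then keep every row belonging to those IDs. The following is only a sketch, assuming dplyr 1.0.0 or later (for slice_sample) and assuming each ID belongs to a single specialty. The names df, n_ids, sampled_ids and result are illustrative and do not come from the question.

```r
library(dplyr)

# Example data from the question, assembled into a data frame
# (the name `df` is an assumption)
df <- data.frame(
  specialty = c("obs", "obs", "ped", "ped", "im", "im"),
  ID        = c("M", "M", "K", "K", "l", "l")
)

# Number of IDs to draw per specialty. Each ID has 2 rows, so
# 1 ID gives 2 rows per specialty; 200 IDs would give 400 rows.
n_ids <- 1

set.seed(1)  # only to make the draw reproducible

# Step 1: one row per (specialty, ID) pair, then sample IDs within each specialty
sampled_ids <- df %>%
  distinct(specialty, ID) %>%
  group_by(specialty) %>%
  slice_sample(n = n_ids) %>%
  ungroup()

# Step 2: keep all rows of the original data that belong to a sampled ID
result <- df %>%
  semi_join(sampled_ids, by = c("specialty", "ID"))
```

With the toy data above each specialty contains only one ID, so all six rows come back. On the real data, setting n_ids to 200 should give 400 rows per specialty, provided every specialty has at least 200 distinct IDs; a specialty with fewer IDs returns all of them.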


