Friday, 17 November 2017

Selecting one episode per ID in a long dataset

I have a long-format dataset where each ID has multiple episodes, and each episode has multiple rows. I would like to randomly select just one episode per ID and keep all of the rows associated with it.

For example:

df <- data.frame(id = c(1,1,1,2,2,2,2), 
    episode = c(1,2,2,1,1,1,2))
> df
  id episode
 1  1       1
 2  1       2
 3  1       2
 4  2       1
 5  2       1
 6  2       1
 7  2       2

I want to be left with a dataset like this one (here episode 2 happened to be drawn for ID 1 and episode 1 for ID 2):

> df2
  id episode
1  1       2
2  1       2
3  2       1
4  2       1
5  2       1

Any help would be greatly appreciated!

E.
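
One possible approach, offered as a minimal base R sketch rather than a definitive answer: reduce the data to its distinct (id, episode) pairs, draw one pair per ID, and merge the draw back onto the full data so every row of the chosen episode is kept. The names `episodes`, `picked_rows`, and `picked` are illustrative only, and the seed is an assumption added purely so the draw can be reproduced.

```r
df <- data.frame(id = c(1, 1, 1, 2, 2, 2, 2),
                 episode = c(1, 2, 2, 1, 1, 1, 2))

set.seed(42)  # assumption: fixed seed, only to make the draw reproducible

# One row per distinct (id, episode) pair
episodes <- unique(df[, c("id", "episode")])

# For each id, draw one row index of `episodes`.
# Sampling a position with sample.int() avoids the surprise that
# sample(x, 1) treats a single number x as the range 1:x.
picked_rows <- vapply(
  split(seq_len(nrow(episodes)), episodes$id),
  function(i) i[sample.int(length(i), 1)],
  integer(1)
)
picked <- episodes[picked_rows, ]

# Keep every row of df that belongs to a chosen (id, episode) pair
df2 <- merge(df, picked, by = c("id", "episode"))
df2
```

Each episode within an ID is equally likely to be drawn, regardless of how many rows it has. Note that `merge()` sorts the result by the key columns, so the original row order is not preserved.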



