Here is a sample of the data:
p <- structure(list(name = structure(1:5, .Label = c("Alice", "Bob",
"Charlie", "Dennis", "Earl"), class = "factor"), cohort = structure(c(3L,
3L, 2L, 2L, 1L), .Label = c("X", "Y", "Z"), class = "factor"),
group = structure(c(1L, 1L, 2L, 2L, 1L), .Label = c("A",
"B"), class = "factor"), var = c(1L, 2L, 1L, 3L, 4L)), .Names = c("name",
"cohort", "group", "var"), class = "data.frame", row.names = c(NA,
-5L))
which looks like this:
     name cohort group var
1   Alice      Z     A   1
2     Bob      Z     A   2
3 Charlie      Y     B   1
4  Dennis      Y     B   3
5    Earl      X     A   4
I need something like the following, based on the cohort column: one row sampled from each cohort (possibly at random), so that no two people in the result belong to the same cohort.
     name cohort group var
2     Bob      Z     A   2
3 Charlie      Y     B   1
5    Earl      X     A   4
I can group_by cohort, but then I'm not sure how to proceed to create a new data frame with only the rows that I need.
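For reference, here is a minimal sketch of one way this might continue after group_by, assuming dplyr 1.0.0 or later (where slice_sample() is available). The data frame is rebuilt by hand from the sample above so the snippet runs on its own. The name one_per_cohort and the set.seed() call are only illustrative assumptions, not part of the original data.

```r
library(dplyr)

# Rebuild the sample data from the question so the sketch is self-contained.
p <- data.frame(
  name   = factor(c("Alice", "Bob", "Charlie", "Dennis", "Earl")),
  cohort = factor(c("Z", "Z", "Y", "Y", "X")),
  group  = factor(c("A", "A", "B", "B", "A")),
  var    = c(1L, 2L, 1L, 3L, 4L)
)

# Assumption: a fixed seed, used here only to make the random draw repeatable.
set.seed(42)

# one_per_cohort is a hypothetical name for the result.
one_per_cohort <- p %>%
  group_by(cohort) %>%      # treat each cohort as its own group
  slice_sample(n = 1) %>%   # draw one random row within each group
  ungroup()                 # drop the grouping from the result

one_per_cohort
```

If the choice does not need to be random, replacing slice_sample(n = 1) with slice(1) should keep the first row of each cohort instead. Either way the result is a new data frame with one row per cohort, ordered by the cohort levels rather than by the original row order.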