I have a dataframe with the following structure:
library(dplyr)
library(tidyr)

dat <- tibble(
  item_type = rep(1:36, each = 6),
  condition1 = rep(c("a", "b", "c"), times = 72),
  condition2 = rep(c("y", "z"), each = 3, times = 36)
) %>%
  unite(unique, item_type, condition1, condition2, sep = "-", remove = FALSE)
which looks like this:
# A tibble: 216 × 4
   unique item_type condition1 condition2
   <chr>      <int> <chr>      <chr>
 1 1-a-y          1 a          y
 2 1-b-y          1 b          y
 3 1-c-y          1 c          y
 4 1-a-z          1 a          z
 5 1-b-z          1 b          z
 6 1-c-z          1 c          z
 7 2-a-y          2 a          y
 8 2-b-y          2 b          y
 9 2-c-y          2 c          y
10 2-a-z          2 a          z
I would like to take a random sample of 36 rows. The sample should include 6 repetitions of each condition1 by condition2 combination, without repeating item_type.
Using slice_sample(), it seems I can get the subset I want:
set.seed(1)
dat %>%
  slice_sample(n = 6, by = c("condition1", "condition2")) %>%
  count(condition1, condition2)
  condition1 condition2     n
  <chr>      <chr>      <int>
1 a          y              6
2 a          z              6
3 b          y              6
4 b          z              6
5 c          y              6
6 c          z              6
But on closer inspection, I see that item_type is repeated:
set.seed(1)
dat %>%
  slice_sample(n = 6, by = c("condition1", "condition2")) %>%
  count(item_type) %>%
  arrange(desc(n))
# A tibble: 22 × 2
   item_type     n
       <int> <int>
 1        10     3
 2        34     3
 3         1     2
 4         6     2
 5         7     2
 6        15     2
 7        20     2
 8        21     2
 9        23     2
10        25     2
# … with 12 more rows
In other words, I would like each item_type to be drawn at most once, so that the 36 sampled rows cover all 36 item types. Is it possible to get slice_sample() to do this?
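For reference, here is a minimal sketch of the result I am after, built without relying on slice_sample() alone. It is not a confirmed slice_sample() feature, just one possible workaround: shuffle the item types, hand each condition1 by condition2 combination six of them, and join back to the data. The names combos, design, and sampled are made up for illustration. It assumes that every item_type occurs in every combination and that the number of item types equals 6 times the number of combinations (36 = 6 × 6 here).

```r
library(dplyr)
library(tidyr)

# Same data as in the question
dat <- tibble(
  item_type = rep(1:36, each = 6),
  condition1 = rep(c("a", "b", "c"), times = 72),
  condition2 = rep(c("y", "z"), each = 3, times = 36)
) %>%
  unite(unique, item_type, condition1, condition2, sep = "-", remove = FALSE)

set.seed(1)

# The 6 distinct condition1 x condition2 combinations
combos <- distinct(dat, condition1, condition2)

# Repeat each combination 6 times (36 slots), then assign a random
# permutation of the 36 item types, one per slot.
# Assumption: every item_type has a row for every combination.
design <- combos[rep(seq_len(nrow(combos)), each = 6), ]
design$item_type <- sample(unique(dat$item_type))

# Keep only the rows of dat that match the randomized design
sampled <- inner_join(
  dat, design,
  by = c("item_type", "condition1", "condition2")
)

# Checks: 6 rows per combination, and no item_type appears twice
count(sampled, condition1, condition2)
n_distinct(sampled$item_type) == nrow(sampled)
```

Because the permutation assigns each item type to exactly one slot, no item_type can repeat, and each combination receives exactly six rows by construction.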