I want to add a new column (category) whose values (a/b) are assigned to random samples (without replacement) of the id column, conditioned on the value (1/2) in the group column. When I try to do so, however, the values in the id column change, and I don't understand why this is happening.
set.seed(123)
df <- data.frame(id=LETTERS[1:10], group=sample(c("1","2"), size=10, replace=TRUE))
df$category <- NA
> table(df$group)
1 2
6 4
df[df$id %in% sample(df[df$group=="1",]$id, size=4, replace=FALSE),]$category <- "a"
df[df$id %in% sample(df[df$group=="2",]$id, size=2, replace=FALSE),]$category <- "b"
> df
   id group category
1   A     1        a
2   B     1     <NA>
3   B     1        a
4   D     2        b
5   E     1     <NA>
6   F     2     <NA>
7   G     2     <NA>
8   H     2        b
9   C     1        a
10  E     1        a
> df$id==LETTERS[1:10]
[1] TRUE TRUE FALSE TRUE TRUE TRUE TRUE TRUE FALSE FALSE
# this should be all TRUE
(Please feel free to edit title and question, if it is not expressed clearly enough)
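A likely explanation, offered as a sketch rather than a definitive diagnosis: in a nested replacement such as df[i, ]$category <- "a", R evaluates the index expression i twice, once to extract the rows and once to write them back. Because i contains a call to sample(), the two evaluations can pick different rows, so complete rows (including their id) are copied onto other rows. Under that assumption, drawing each sample once and storing it avoids the problem. The names picked_a and picked_b are hypothetical, and stringsAsFactors = FALSE is added here only so the ids stay character vectors on older R versions.

```r
set.seed(123)
df <- data.frame(id = LETTERS[1:10],
                 group = sample(c("1", "2"), size = 10, replace = TRUE),
                 stringsAsFactors = FALSE)
df$category <- NA_character_

# Draw each sample exactly once and keep the result
# (picked_a and picked_b are hypothetical helper names).
picked_a <- sample(df$id[df$group == "1"], size = 4)
picked_b <- sample(df$id[df$group == "2"], size = 2)

# Assign into the category column only; the stored ids cannot change
# between reading and writing, and the id column is never written to.
df$category[df$id %in% picked_a] <- "a"
df$category[df$id %in% picked_b] <- "b"

all(df$id == LETTERS[1:10])  # TRUE: id is left untouched
```

Indexing the column directly (df$category[rows] <- value) instead of the row subset (df[rows, ]$category <- value) also means that only the category values are replaced, never whole rows.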