I am trying to repeatedly add columns to a dataframe using random sampling from another dataframe.
My first dataframe, which holds the actual data to be sampled from, looks like this:
df <- data.frame(cat = c("a", "b", "c","a", "b", "c"),
x = c(6,23,675,1,78,543))
I have another dataframe like this:
df2 <- data.frame(obs =c(1,2,3,4,5,6,7,8,9,10),
cat=c("a", "a", "a", "b", "b", "b", "c","c","c", "c"))
I want to add 1000 new columns to df2, each a random sample from df, grouped by cat. I figured out a (probably very amateurish) way of doing this once: I use slice_sample() to make a new dataframe sample1 with a random sample of df, and then merge sample1 with df2.
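library(dplyr)

df <- df %>%
  group_by(cat)
df2 <- df2 %>%
  group_by(cat)
# draw 3 rows per cat, with replacement (9 rows in total)
sample1 <- slice_sample(df, n = 3, replace = TRUE)
sample1 <- sample1 %>%
  ungroup() %>%
  mutate(obs = 1:9) %>%
  select(-cat)
# obs 10 has no match in sample1, so merge() drops that row
df3 <- merge(df2, sample1, by = "obs")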
Now I want to find a way to repeat this 1000 times, so that df3 ends up with 1000 new columns (x1, x2, x3, etc.).
I have looked into repeat loops, but haven't been able to figure out how to make the above code work inside the loop.
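For reference, here is a rough sketch of one way the repetition could be done in base R, without merging inside a loop. It is not necessarily the best approach. It assumes that every row of df2 should get an independent draw, with replacement, from the x values in df that share its cat. The names pool, draw_once, n_rep and draws are made up for this illustration, and set.seed() is only there to make the sketch reproducible.

```r
set.seed(42)  # assumption: fixed seed, only for reproducibility

df <- data.frame(cat = c("a", "b", "c", "a", "b", "c"),
                 x = c(6, 23, 675, 1, 78, 543))
df2 <- data.frame(obs = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
                  cat = c("a", "a", "a", "b", "b", "b", "c", "c", "c", "c"))

# x values available for each category in df
pool <- split(df$x, df$cat)

# one draw per row of df2, taken from the pool of that row's category
# (hypothetical helper; sample.int() avoids sample()'s length-1 pitfall)
draw_once <- function() {
  vapply(as.character(df2$cat),
         function(g) {
           v <- pool[[g]]
           v[sample.int(length(v), 1)]
         },
         numeric(1), USE.NAMES = FALSE)
}

n_rep <- 1000
# replicate() returns a matrix with one column per repetition
draws <- replicate(n_rep, draw_once())
colnames(draws) <- paste0("x", seq_len(n_rep))

# bind the sampled columns x1 ... x1000 onto df2
df3 <- cbind(df2, draws)
```

Because the sampling follows the rows of df2 directly, all 10 observations are kept, and the result has the two original columns plus x1 to x1000.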