Friday, 18 June 2021

Efficient way to add sample information as new column to data set

I know how to subset a data frame by sampling certain rows. However, I'm struggling to find an easy (preferably tidyverse) way to simply add the sampling information as a new column in my data set, i.e. to populate a new column with 1 if the row was sampled and 0 if not.

My current approach is below, but it feels overly complicated. Note that in this example I want to sample 3 rows per group.

df <- data.frame(group = c(1,2,1,2,1,1,1,1,2,2,2,2,2,1,1),
                 var   = 1:15)

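library(tidyverse)

df <- df %>%
  group_by(group) %>%
  # Assign each row a random rank within its group (a permutation of 1..n()),
  # then flag the rows whose rank is at most 3
  mutate(sampling_info = sample.int(n(), size = n(), replace = FALSE),
         sampling_info = if_else(sampling_info <= 3, 1, 0))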
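
For comparison, here is a minimal sketch of a possibly more compact variant: draw 3 row positions per group and flag the rows whose within-group position was drawn. It uses only `row_number()`, `n()` and base `sample()`. The name `df_flagged`, the `set.seed()` call and the `min(3, n())` guard for groups with fewer than 3 rows are illustrative assumptions, not part of the original code.

```r
library(dplyr)

df <- data.frame(group = c(1,2,1,2,1,1,1,1,2,2,2,2,2,1,1),
                 var   = 1:15)

set.seed(42)  # assumption: fixed seed only to make the draw reproducible

df_flagged <- df %>%
  group_by(group) %>%
  # sample(n(), k) draws k distinct positions from 1..n() within each group;
  # min(3, n()) avoids an error if a group has fewer than 3 rows
  mutate(sampling_info = as.integer(row_number() %in% sample(n(), min(3, n())))) %>%
  ungroup()

# Sanity check: number of flagged rows per group
df_flagged %>% count(group, wt = sampling_info)
```

Both versions mark exactly 3 rows per group here; this one just avoids the intermediate permutation column.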


