I am trying to create a new data frame by randomly sampling an existing data frame. Specifically, I want to create a data frame that is the same size as the original, where each column of the new data frame is a random sample (with replacement) of the corresponding column in the original. My first attempt looked like this:
# Load dplyr for %>% and sample_n()
library(dplyr)
# Create toy data set
data.set <- as.data.frame(matrix(1:50, ncol = 5))
# Change names
colnames(data.set) <- c("Stuff", "Things", "Foo", "Bar", "Guff")
# Try to create randomly sampled data frame
data.set %>% sample_n(replace = TRUE, size = nrow(data.set))
The problem is that this samples whole rows rather than sampling the elements of each column individually. For example, here is some output:
    Stuff Things Foo Bar Guff
2       2     12  22  32   42
10     10     20  30  40   50
2.1     2     12  22  32   42
3       3     13  23  33   43
5       5     15  25  35   45
3.1     3     13  23  33   43
8       8     18  28  38   48
9       9     19  29  39   49
1       1     11  21  31   41
6       6     16  26  36   46
Notice that the first and third rows are exactly the same, as are the fourth and sixth rows. What I would like is for every column to be sampled independently of the others. So I tried this:
apply(data.set, MARGIN = 2, sample_n, replace = TRUE, size = nrow(data.set))
This produced the following error:
Error: Don't know how to sample from objects of class integer
I don't see what I did incorrectly, though. Can anyone offer a concise way of achieving my goal?
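For what it's worth, a likely reason for the error is that `apply()` hands each column to the function as a plain integer vector, while `sample_n()` expects a data frame (tbl). One possible sketch of per-column resampling, assuming base R's `sample()` is acceptable in place of `sample_n()`, is below. The name `resampled` and the fixed seed are only illustrative assumptions, not part of the original attempt.

```r
# Toy data set from the question
data.set <- as.data.frame(matrix(1:50, ncol = 5))
colnames(data.set) <- c("Stuff", "Things", "Foo", "Bar", "Guff")

set.seed(1)  # assumption: fixed seed, only so repeated runs give the same draw

# lapply() visits each column as a vector; base sample() resamples that
# vector with replacement, and its default size is the vector's length,
# so every column keeps the original number of rows.
resampled <- as.data.frame(lapply(data.set, sample, replace = TRUE))

dim(resampled)
```

Each column is drawn separately, so rows of the result no longer have to match rows of the original. One caveat: `sample()` treats a single number `n >= 1` as `1:n`, so this sketch would behave differently on a one-row data frame.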