Sunday, 3 September 2023

Take random sample from each column by group in R

I have a dataset with several thousand households and for each household the consumption of roughly 60 food items was recorded for one month of the year.

I want to generate annual food consumption for each household and food item by filling in the remaining months of the year, repeatedly sampling (with replacement) from my data.

This is what my data looks like, but with only 3 food items.

data <- data.frame(matrix(NA, nrow=120, ncol=5))
names(data) <- c("ID", "Month", "Apple", "Maize", "Bread")
data$ID <- 1:120
data$Month <- rep(1:12, each=10)
data$Apple <- data$Month + 2 * runif(nrow(data)) - 1   # Consumption is seasonal but random
data$Maize <- data$Month + 2 * runif(nrow(data)) - 1
data$Bread <- data$Month + 2 * runif(nrow(data)) - 1

data

I want to group the data by month and then take a random sample from each column (food item) to have data for each household and item for every month of the year. The result should look like the data frame below.

data2 <- data.frame(matrix(NA, nrow=1440, ncol=5))
names(data2) <- c("ID", "Month", "Apple", "Maize", "Bread")
data2$ID <- rep(1:120, each=12)
data2$Month <- rep(1:12, times=120)   # months 1-12 repeated for each household

data2
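For illustration, here is one possible base R sketch of the sampling described above. It rests on two assumptions that are not stated in the question: each food item is drawn independently within a month (so a filled-in household-month can mix values from different donor households), and each household keeps its actually observed value for the month in which it was surveyed. The names `items`, `draw`, `pools` and `obs` are only illustrative.

```r
set.seed(1)  # only so that this sketch is reproducible

# Example data as in the question: one observed month per household
data <- data.frame(ID = 1:120, Month = rep(1:12, each = 10))
for (item in c("Apple", "Maize", "Bread")) {
  data[[item]] <- data$Month + 2 * runif(nrow(data)) - 1
}

# Every column except the keys is treated as a food item
items <- setdiff(names(data), c("ID", "Month"))

# One row per household and month, ordered by ID and then Month
data2 <- expand.grid(Month = 1:12, ID = data$ID)[, c("ID", "Month")]

# Draw n values with replacement from a pool of observed values.
# Indexing via sample.int() avoids the surprise that sample(x, ...)
# samples from 1:x when x is a single number.
draw <- function(pool, n) pool[sample.int(length(pool), n, replace = TRUE)]

for (item in items) {
  pools <- split(data[[item]], data$Month)   # observed values by month
  data2[[item]] <- NA_real_
  for (m in names(pools)) {
    rows <- data2$Month == as.integer(m)
    data2[[item]][rows] <- draw(pools[[m]], sum(rows))
  }
}

# Assumption: keep the real observation for each household's survey month
obs <- match(paste(data$ID, data$Month), paste(data2$ID, data2$Month))
data2[obs, items] <- data[, items]

head(data2, 12)
```

Because the item columns are picked up with `setdiff()`, the same loop should work unchanged for the roughly 60 food items in the full dataset. If the consumption of different items within a household needs to stay correlated, sampling whole donor rows per month instead of single columns would be the alternative to consider.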


