I need to select a specific number of rows per resampling run, from 4 to 50 rows.
In the first run, the function should select 4 random rows and calculate the mean, variance, and confidence interval for a given variable, and repeat this 1000 times. The second run should do the same thing, but select 5 random rows 1000 times, and so on up to 50 rows.
The example:
x1 <- matrix(rnorm(200, mean = 10), nrow = 100, ncol = 2)
# 100 group labels in total: 5 AA, 15 BB, 15 CC, 10 DD, ..., 15 JJ
x2 <- rep(c("AA", "BB", "CC", "DD", "EE", "FF", "GG", "HH", "II", "JJ"),
          times = c(5, 15, 15, 10, 10, 10, 10, 5, 5, 15))
# data.frame() keeps the numeric columns numeric; cbind() would coerce them to character
df <- data.frame(x1, x2)
colnames(df) <- c("variable1", "variable2", "group")
I'm running the code below manually, and it seems to be right.
samples <- vector(mode = "list", length = 1000)
for (i in 1:1000) {
  samples[[i]] <- sample(as.numeric(df$variable1), size = 4, replace = FALSE)
}
# function to calculate a 95% confidence interval (normal approximation)
conf <- function(x) {
  error <- qnorm(0.975) * sd(x) / sqrt(length(x))
  return(data.frame("lower" = mean(x) - error,
                    "upper" = mean(x) + error))
}
# calculating mean, variance and confidence interval of the simulations
mean1 <- lapply(samples, mean) # mean of the 4 selected rows in each simulation
mean2 <- unlist(mean1) # unlist the per-simulation means
mean_4rows <- mean(mean2) # overall mean of the per-simulation means
var1 <- lapply(samples, var) # variance of the 4 selected rows in each simulation
var2 <- unlist(var1) # unlist the per-simulation variances
var_4rows <- var(var2) # variance of the per-simulation variances
conf1 <- lapply(samples, conf) # confidence interval of the 4 selected rows in each simulation
conf2 <- unlist(conf1) # flattens the lower and upper bounds into one vector
conf_4rows <- conf(conf2) # interval computed over those pooled bounds
However, I need to automate this code so that it selects from 4 to 50 random rows (1000 times for each number of rows) and calculates the mean, variance, and CIs of the simulations.
In the end I would like an object with the overall mean, variance, and CI for each number of selected rows, with one row for the selection of 4 rows, one for 5 rows, and so on up to 50 rows:
#> rows meanSim varianceSim lowerCISim upperCISim
#> 4 1.84 0.410 0.105 0.300
#> 5 1.69 0.951 1.023 2.098
#> 6 1.99 0.714 1.234 1.987
#> .....
#> 50 2.58 0.242 2.098 2.999
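One way the automation could look, as a rough sketch rather than a definitive solution: wrap the per-size simulation in a function and apply it to every size from 4 to 50. The helper name `simulate_size`, the stand-in vector `x`, and the output file name are assumptions made for illustration. The summary statistics here are also only one interpretation: `varianceSim` is the variance of the 1000 sample means, and the interval is a normal-approximation 95% interval around the mean of those means, which differs from the manual code above.

```r
set.seed(1)                    # only for reproducibility of this sketch
x <- rnorm(100, mean = 10)     # stand-in for as.numeric(df$variable1)

# Hypothetical helper: draw `n_sims` samples of `size` values and summarise them
simulate_size <- function(x, size, n_sims = 1000) {
  # Each column of `draws` is one sample of `size` values, without replacement
  draws <- replicate(n_sims, sample(x, size = size, replace = FALSE))
  sim_means <- colMeans(draws)
  # Normal-approximation 95% interval around the mean of the simulated means
  error <- qnorm(0.975) * sd(sim_means) / sqrt(n_sims)
  data.frame(
    rows        = size,
    meanSim     = mean(sim_means),
    varianceSim = var(sim_means),
    lowerCISim  = mean(sim_means) - error,
    upperCISim  = mean(sim_means) + error
  )
}

# One summary row per sample size, stacked into a single data frame
results <- do.call(rbind, lapply(4:50, simulate_size, x = x))

# Save the summary; the file name is a placeholder
write.csv(results, "simulation_summary.csv", row.names = FALSE)
```

`results` then has 47 rows (one per sample size) with the columns shown in the desired output above; swapping in other summary statistics only requires changing the body of the helper.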
Any idea on how I can make this automated and save these results?
Thank you!