I am trying to use bootstrapping to estimate the differences between two samples drawn from two different populations (and to do this for three cases). I am coding it in R. Instead of using the boot function, I resample the row indices at random myself.
I have tried different numbers of simulations and different seeds for the random number generator. Both seem to influence the results, but in an unexpected way:
n.simul <- 100
set.seed(1)
coef_V1 <- data.frame(matrix(0, ncol = 6, nrow = n.simul))
for (i in 1:n.simul) {
  # Resample the row indices of each data set with replacement
  indices1 <- sample(nrow(MZArrayV1), size = nrow(MZArrayV1), replace = TRUE)
  indices2 <- sample(nrow(MZArrayV1_G), size = nrow(MZArrayV1_G), replace = TRUE)
  isample1 <- MZArrayV1[indices1, ]
  isample2 <- MZArrayV1_G[indices2, ]
  # (Here I fit each of the two samples to its own parabola and calculate the
  # differences between the fits, which are shown in the histograms below.)
}
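For anyone who wants to reproduce the setup, here is a minimal self-contained sketch of the same loop. It is not my actual code: the data are synthetic, the helper names `make_data` and `run_bootstrap` are made up for illustration, I assume each data set has an `x` and a `y` column, and I assume the parabola fit is done with `lm(y ~ x + I(x^2))`. It stores only the three coefficient differences per simulation.

```r
# Hypothetical stand-in for MZArrayV1 / MZArrayV1_G: noisy points on a parabola.
make_data <- function(n, a, b, c) {
  x <- runif(n, -1, 1)
  data.frame(x = x, y = a + b * x + c * x^2 + rnorm(n, sd = 0.1))
}

# One bootstrap run: resample the rows of each data set with replacement,
# fit a parabola to each resample, and keep the coefficient differences.
run_bootstrap <- function(d1, d2, n.simul, seed) {
  set.seed(seed)
  diffs <- matrix(NA_real_, nrow = n.simul, ncol = 3,
                  dimnames = list(NULL, c("d_intercept", "d_linear", "d_quadratic")))
  for (i in seq_len(n.simul)) {
    s1 <- d1[sample(nrow(d1), size = nrow(d1), replace = TRUE), ]
    s2 <- d2[sample(nrow(d2), size = nrow(d2), replace = TRUE), ]
    fit1 <- lm(y ~ x + I(x^2), data = s1)  # assumed form of the parabola fit
    fit2 <- lm(y ~ x + I(x^2), data = s2)
    diffs[i, ] <- coef(fit1) - coef(fit2)
  }
  as.data.frame(diffs)
}

# Synthetic data (assumption: the true quadratic terms differ by 0.5).
set.seed(42)
d1 <- make_data(200, 1.0, 0.5, 2.0)
d2 <- make_data(200, 1.2, 0.5, 1.5)

# Same data, different number of simulations and different seed.
res_a <- run_bootstrap(d1, d2, n.simul = 100, seed = 1)
res_b <- run_bootstrap(d1, d2, n.simul = 500, seed = 123456789)

# Standard deviation of each bootstrap distribution, per run.
print(sapply(list(run_a = res_a, run_b = res_b), function(r) apply(r, 2, sd)))
```

Wrapping the loop in a function that takes the seed and the number of simulations makes it easy to rerun the same data under different settings and compare the spread of the resulting distributions side by side.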
Does anyone have any idea why this happens? I would expect behavior like that of the runs with 100 simulations, or with 500 simulations and a seed of 123456789. The other runs, where one of the distributions shows a single narrow peak, are just weird.