Setting a seed ensures reproducibility and is important in simulation modelling. Consider a simple model f() with two variables of interest, y1 and y2. The outputs of these variables are determined by a random process (rbinom()) and the parameters x1 and x2. The outputs of the two variables of interest are independent of each other.
Now suppose we want to compare the output of a variable after its parameter has been changed with the output before the change was made (i.e. a sensitivity analysis). If no other parameter has been changed and the same seed is set, shouldn't the output of the unaffected variable remain the same as in the default simulation, since this variable is independent of the other?
In short, why does the output of variable y2 (determined by parameter x2) in the code below change after only x1 is changed, even though the same seed is set? One could just ignore the output of y2 and focus only on y1, but in a larger simulation where each variable is a cost component of the total cost, a change in an unaffected variable may become problematic when testing the overall sensitivity of the model to individual parameter changes.
#~ parameters and model
x1 <- 0.0
x2 <- 0.5
n <- 10
ts <- 5
f <- function(){
  out <- data.frame(step = rep(0, n),
                    space = 1:n,
                    id = 1:n,
                    y1 = rep(1, n),
                    y2 = rep(0, n))
  l.out <- vector(mode = "list", length = n)
  for(i in 1:ts){
    out$step <- i
    out$y1[out$y1 == 0] <- 1
    out$id[out$y2 == 1] <- seq_along(which(out$y2 == 1)) + n
    out$y2[out$y2 == 1] <- 0
    out$y1 <- rbinom(nrow(out), 1, 1-x1)
    out$y2 <- rbinom(nrow(out), 1, x2)
    n <- max(out$id)
    l.out[[i]] <- out
  }
  do.call(rbind, l.out)
}
#~ Simulation 1 (default)
set.seed(1)
run1 <- f()
set.seed(1)
run2 <- f()
run1 == run2 #~ all observations true as expected
#~ Simulation 2
#~ change in x1 parameter affecting only variable y1
x1 <- 0.25
set.seed(1)
run3 <- f()
set.seed(1)
run4 <- f()
run3 == run4 #~ all observations true as expected
#~ compare variables after change in x1 has occurred
run1$y1 == run3$y1 #~ observations differ as expected
run1$y2 == run3$y2 #~ observations differ - why?
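The same behaviour can be reproduced without the model. The sketch below is a minimal reproduction, assuming the relevant factor is that both rbinom() calls draw from R's single global random number stream, so the y2 draw starts wherever the y1 draw stopped. draw_pair() and its reseed argument are hypothetical helpers introduced only for illustration and are not part of f().

```r
# Hypothetical helper, not part of the original model: draws y1 and then y2
# from R's single global random number stream, as f() does in every step.
draw_pair <- function(p1, p2, n = 10, reseed = NULL) {
  y1 <- rbinom(n, 1, p1)
  # Optionally restart the stream so that y2 no longer depends on how many
  # random numbers the y1 draw consumed (an assumption-laden workaround,
  # shown only to isolate the effect).
  if (!is.null(reseed)) set.seed(reseed)
  y2 <- rbinom(n, 1, p2)
  list(y1 = y1, y2 = y2)
}

# Shared stream: only p1 differs between the two calls.
set.seed(1)
a <- draw_pair(p1 = 1.00, p2 = 0.5)
set.seed(1)
b <- draw_pair(p1 = 0.75, p2 = 0.5)
identical(a$y2, b$y2)   # expected to differ, as in run1 vs run3

# Reseeding immediately before the y2 draw: same p2, same seed, same y2.
set.seed(1)
a2 <- draw_pair(p1 = 1.00, p2 = 0.5, reseed = 2)
set.seed(1)
b2 <- draw_pair(p1 = 0.75, p2 = 0.5, reseed = 2)
identical(a2$y2, b2$y2) # TRUE
```

If the first comparison differs while the second does not, that would suggest the y2 values depend on the position in the random number stream at the time of the draw rather than on x1 itself.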