Wednesday, 21 August 2019

How do I stop the same 'random' rows being selected across different computers?

I have some code that runs several simulations, and I am hoping to run it across separate computers. Each simulation requires identifying a random subset of the data to then run the analyses on. When I run this on separate computers at the same time, I notice that the same rows are selected for each simulation. So if I am running 3 simulations, each simulation will identify the same 'random' samples across the separate computers. I am not sure why this is; can anyone suggest any code to get around this?

I show the sample_n function from dplyr below, but the same thing happens with the 'sample' function in base R. Thanks in advance.

library(dplyr)
explanatory <- c(1,2,3,4,3,2,4,5,6,7,8,5,4,3)
response <- c(3,4,5,4,5,6,4,6,7,8,6,10,11,9)

A <- data.frame(explanatory,response)
B <- data.frame(explanatory,response)
C <- data.frame(explanatory,response)

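# Each simulation draws 8 random rows from each data frame
results <- vector("list", 3)
for (i in 1:3) {
  Rand_A <- sample_n(A, 8)
  Rand_B <- sample_n(B, 8)
  Rand_C <- sample_n(C, 8)
  Rand_All <- rbind(Rand_A, Rand_B, Rand_C)
  results[[i]] <- Rand_All  # keep each simulation's sample instead of overwriting it
}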
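
One possible explanation, which is an assumption because the post does not show how the sessions are started, is that every computer begins from the same random number generator state. This can happen if set.seed() is called with the same value earlier in the script, or if a saved workspace containing .Random.seed is restored at startup. The sketch below gives each computer and each simulation its own seed. The helper draw_rows and the variable machine_id are hypothetical names introduced for illustration, and base R's sample() stands in for sample_n so the example runs without dplyr.

```r
# Hypothetical helper (not from the post): seed the RNG from a per-computer ID
# and the simulation number, then draw n random rows from a data frame.
draw_rows <- function(data, n, machine_id, sim) {
  # Assumes fewer than 1000 simulations per computer, so seeds never collide
  set.seed(1000 * machine_id + sim)
  data[sample(nrow(data), n), ]
}

explanatory <- c(1, 2, 3, 4, 3, 2, 4, 5, 6, 7, 8, 5, 4, 3)
response <- c(3, 4, 5, 4, 5, 6, 4, 6, 7, 8, 6, 10, 11, 9)
A <- data.frame(explanatory, response)

machine_id <- 1  # assumption: set by hand to 1, 2, 3, ... on each computer

for (i in 1:3) {
  Rand_A <- draw_rows(A, 8, machine_id, i)
  print(Rand_A)
}
```

Because the seed depends on both machine_id and the simulation number, each run is reproducible while different computers start from different generator states. If reproducibility is not needed, calling set.seed(NULL) at the top of the script re-initializes the generator as if no seed had been set.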



