Monday, 30 August 2021

Unable to replicate sample_n results using constant seed across R versions

I'm trying to replicate the sampling results from a script I wrote in early January 2021. At the time, I forgot to record which versions of R and dplyr I used to create the sample. I have since reinstalled R at the newest version (4.1.1) along with dplyr (1.0.7), but I can't reproduce my sampling results. I know that earlier R versions may use different RNGs, so I've used RNGversion() to try my seed against every earlier R version, to no avail. That is not entirely surprising: I recall having used at least R 3.6.0, and the default RNG shouldn't have changed since then.

rm(list = ls())       # clear the workspace
library(dplyr)
RNGversion("3.5.0")   # one of the RNG versions I tried
set.seed(182508)
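To make the RNG check systematic, here is a minimal sketch in base R that draws with the same seed under several RNG versions and compares the results. The helper `draw_under()`, the version list, and the population size of 100 are hypothetical and chosen only for illustration; the real script would sample from the actual data.

```r
# Hypothetical helper: draw k indices from 1..population under a given RNG version.
draw_under <- function(version, seed, population, k) {
  old <- RNGkind()                                   # remember current RNG settings
  on.exit(suppressWarnings(do.call(RNGkind, as.list(old))), add = TRUE)
  suppressWarnings(RNGversion(version))              # "3.5.0" warns about the old sampler
  set.seed(seed)
  sample.int(population, k)
}

versions <- c("3.5.0", "3.6.0", "4.0.0")             # illustrative choice of versions
draws <- lapply(versions, draw_under, seed = 182508, population = 100, k = 5)
names(draws) <- versions
print(draws)

# Versions from 3.6.0 onward share the same RNG settings, so these draws match.
same_since_360 <- identical(draws[["3.6.0"]], draws[["4.0.0"]])
print(same_since_360)
```

If the draws for 3.6.0 and later are identical, as expected, then the R version alone cannot explain the mismatch for a script that was run on R 3.6.0 or newer.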

Are there any other factors besides the R version that could affect my randomization results? For example, could changes to dplyr's sample_n() matter? I know that sample_n() has been superseded by slice_sample(), but sample_n() is still usable in the newest version of dplyr.
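One factor I can at least rule in or out myself is the state of the input data. The sketch below uses base R only and assumes that row sampling boils down to drawing row positions, in a way comparable to `sample.int()`; I have not verified that this holds for every dplyr version. The data frame `df` is made up for illustration. It shows that the same seed yields the same positions, but different rows, once the row order of the input changes.

```r
# Made-up data: 10 rows with a stable identifier.
df <- data.frame(id = 1:10, value = letters[1:10])

set.seed(182508)
idx_original <- sample.int(nrow(df), 3)        # positions drawn from the original order

reordered <- df[rev(seq_len(nrow(df))), ]      # same rows, reversed order

set.seed(182508)
idx_reordered <- sample.int(nrow(reordered), 3)

# Same seed and same row count give the same positions...
same_positions <- identical(idx_original, idx_reordered)

# ...but those positions now point at different rows.
ids_original  <- df$id[idx_original]
ids_reordered <- reordered$id[idx_reordered]

print(same_positions)
print(ids_original)
print(ids_reordered)
```

So even with an identical seed and RNG, any upstream change to the data (a different sort order, added or dropped rows, a changed filter) would be enough to change which rows end up in the sample.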



