Wednesday, September 28, 2016

sync randomization between R and Python

Are there off-the-shelf approaches in R and Python that will allow me to generate identical PRNG results?

Background: I am trying to migrate some code from R to Python. It contains an algorithm that uses sampling without replacement.

I can validate the migrated code either (simpler) by showing that it produces identical results on a small set of inputs or (more complex) by showing that the results converge given a larger set of inputs.
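To make the "more complex" option concrete, here is a rough sketch of one possible convergence check. It is not taken from the code being migrated. It assumes the sampler is meant to be uniform, so each element should land in a size-k sample with probability k/n. The helper names (inclusion_frequencies, max_deviation_from_uniform) and the trial count are made up for illustration.

```python
import random
from collections import Counter


def inclusion_frequencies(sampler, population, k, trials):
    """Estimate how often each item appears in a size-k sample."""
    counts = Counter()
    for _ in range(trials):
        counts.update(sampler(population, k))
    return {item: counts[item] / trials for item in population}


def max_deviation_from_uniform(freqs, n, k):
    """Largest gap between an observed frequency and the expected k/n."""
    expected = k / n
    return max(abs(f - expected) for f in freqs.values())


# Illustrative run: the seed and the number of trials are arbitrary choices.
rng = random.Random(12345)
population = list(range(10))
freqs = inclusion_frequencies(rng.sample, population, 3, 20000)
deviation = max_deviation_from_uniform(freqs, len(population), 3)
print(deviation)
```

The same frequencies could be computed on the R side and compared against the same k/n target. Both implementations should drift toward it as the number of trials grows, even though the individual draws differ.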

Let's say I wanted to take the simple route. Obviously, the following do not produce identical results, because there are different PRNGs and implementations of sampling functions under the hood.

# R
set.seed(1)
sample(0:9,3)
# returns vector 2 3 4

and:

# Python
import random
random.seed(1)
random.sample(range(10),3)
# returns list 2 1 4

I could just suck it up and write my own PRNG, which would be educational but would require more programmer time (on non-optimized code). If it came to that I would probably opt for the more complex validation.
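For reference, a hand-rolled generator does not have to be large. Below is a minimal sketch, assuming a Park-Miller "minimal standard" linear congruential generator combined with a partial Fisher-Yates shuffle. It is not what R or Python use internally. The names MinStdLCG and sample_without_replacement are hypothetical. The modulo reduction in randbelow is slightly biased, which is fine for cross-checking two ports but not for production use.

```python
class MinStdLCG:
    """Hypothetical portable PRNG: state = state * 16807 mod (2**31 - 1)."""

    MODULUS = 2**31 - 1
    MULTIPLIER = 16807

    def __init__(self, seed):
        # The state must stay in 1..MODULUS-1, so map a zero seed to 1.
        self.state = seed % self.MODULUS
        if self.state == 0:
            self.state = 1

    def next_int(self):
        # The product stays below 2**53, so an R port using doubles is exact.
        self.state = (self.state * self.MULTIPLIER) % self.MODULUS
        return self.state

    def randbelow(self, n):
        # Simple modulo reduction: slightly biased, but trivial to port.
        return self.next_int() % n


def sample_without_replacement(population, k, rng):
    """Draw k distinct items using a partial Fisher-Yates shuffle."""
    pool = list(population)
    n = len(pool)
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and len(population)")
    for i in range(k):
        # Pick a position in the not-yet-fixed tail and swap it into slot i.
        j = i + rng.randbelow(n - i)
        pool[i], pool[j] = pool[j], pool[i]
    return pool[:k]


result = sample_without_replacement(range(10), 3, MinStdLCG(1))
print(result)
```

A line-for-line port to R (using %% for the modulo and 1-based indexing adjusted accordingly) should give the same draws for the same seed, because every intermediate value is an integer small enough to be represented exactly.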

Are there any alternatives to doing so using existing libraries? I would prefer not to introduce third-party dependencies (like a Java API). I am currently leaning toward an rPython implementation but am open to anything.



