Sunday, 5 March 2017

Why are the results inconsistent for sampling in R vs. sampling in Spark?

I have a dataframe that I need to sample from and then perform operations on in Spark, but even after setting a seed value, I don't get exactly the same samples in each run. ("How can I make my dataframe sample in each run using Apache Spark") In contrast, when I sample the same dataframe in R with a seed value, I get exactly the same samples every time.

Is this expected behavior, and is it acceptable?
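
To make the question concrete, here is a toy model of one way this could happen. It is plain Python, not Spark's actual implementation: `seeded_sample` and its "seed plus partition index" rule are hypothetical names and assumptions made up for illustration. The idea it sketches is that a seed only fixes the random number stream, so if the rows are laid out differently across partitions between runs, the same seed can end up selecting different rows.

```python
import random


def seeded_sample(partitions, fraction, seed):
    """Keep each row with probability `fraction`, using one RNG per partition."""
    kept = []
    for index, rows in enumerate(partitions):
        # Hypothetical seeding rule (an assumption, not Spark's code):
        # derive each partition's RNG from the user seed plus the partition index.
        rng = random.Random(seed + index)
        kept.extend(row for row in rows if rng.random() < fraction)
    return kept


rows = list(range(20))

# The same 20 rows, laid out in two different ways across two partitions.
layout_a = [rows[:10], rows[10:]]
layout_b = [rows[::2], rows[1::2]]

first_run = seeded_sample(layout_a, 0.5, seed=42)
second_run = seeded_sample(layout_a, 0.5, seed=42)
other_layout = seeded_sample(layout_b, 0.5, seed=42)

print(first_run == second_run)  # same layout and same seed: same sample
print(sorted(first_run))
print(sorted(other_layout))     # same seed, different layout: may differ
```

In this sketch the sample is reproducible only while both the seed and the row layout stay fixed. A single in-memory R data frame has a fixed row order, which would be consistent with the R result being stable across runs.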



