Thursday, July 2, 2020

PySpark - how to generate random numbers within a certain range of a column value?

Initially I wanted to generate random integers between two numbers (10 and 80):

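from random import randint
# Note: randint() is evaluated once on the driver, so every null in 'score'
# is filled with the same random integer.
df.fillna(randint(10, 80), 'score').show()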

What would be the correct way to generate random decimals within a certain range of a column's current value? For example, random decimals within +/- 15% of the value in a 'score' column, such as 25.0?

I've looked at the documentation, but the only examples show how to generate random numbers with a seed, and I'm not sure that is suitable in this case.
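
To make the target range concrete, here is a minimal plain-Python sketch of the arithmetic behind "within +/- 15% of a value". The helper name `jitter` and its parameters are hypothetical, introduced only for illustration; the seed is there just to make the run reproducible and is not required.

```python
import random


def jitter(value, pct=0.15, rng=random):
    """Return a random float within +/- pct of value (hypothetical helper)."""
    # Draw a scale factor uniformly from [1 - pct, 1 + pct].
    factor = 1.0 + rng.uniform(-pct, pct)
    return value * factor


# Seeded generator, used only so repeated runs give the same samples.
rng = random.Random(42)

score = 25.0
low = score - score * 0.15   # lower bound, roughly 21.25
high = score + score * 0.15  # upper bound, roughly 28.75

samples = [jitter(score, 0.15, rng) for _ in range(5)]
print(low, high)
print(samples)
```

In PySpark the same idea could presumably be written as a column expression, for example `F.col('score') * (1 + (F.rand() * 2 - 1) * 0.15)` with `from pyspark.sql import functions as F`. `rand` draws a uniform value in [0, 1) for each row, and its seed argument is optional, so the seeded examples in the documentation do not rule it out here.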