vendredi 9 août 2019

How to generate random categorical data in python according to a probability distribution?

I am trying to generate a random column of categorical variable from an existing column to create some synthesized data. For example if my column has 3 values 0,1,2 with 0 appearing 50% of the time and 1 and 2 appearing 30 and 20% of the time I want my new random column to have similar (but not same) proportions as well

There is a similar question on cross validated that has been solved using R. https://stats.stackexchange.com/questions/14158/how-to-generate-random-categorical-data. However I would like a Python Solution for this




Aucun commentaire:

Enregistrer un commentaire