Saturday, 21 March 2020

Generate random variables from a probability distribution

I have extracted some variables from my data set in Python, and I want to generate a larger data set from their distributions. The difficulty is that I want to introduce some variability into the new data set while keeping its overall behaviour similar. Here is an example of my extracted data, which consists of 400 observations:

Value    Observation Count     Ratio of Entries
1        352                    0.88
2        28                     0.07
3        8                      0.02
4        4                      0.01
7        4                      0.01
13       4                      0.01

Now I am trying to use this information to generate a similar data set with 2,000 observations. I am aware of the numpy.random.choice and random.choice functions, but I do not want to reuse exactly the same distribution. Instead, I would like to generate random values (the Value column) based on the distribution but with more variability. An example of how I want my larger data set to look:

Value         Observation Count        Ratio of Entries
1             1763                     0.8815
2             151                      0.0755
3             32                       0.0160
4             19                       0.0095
5             10                       0.0050
6             8                        0.0040
7             2                        0.0010
8             4                        0.0020
9             2                        0.0010
10            3                        0.0015
11            1                        0.0005
12            1                        0.0005
13            1                        0.0005
14            2                        0.0010
15            1                        0.0005

The new distribution is roughly what I would get by fitting an exponential decay function to my original data; however, I need discrete values, not continuous ones. How do I get around this, and is there a particular mathematical method relevant to what I am trying to do?
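
One possible direction, sketched here under assumptions rather than as a definitive answer: treat the observed counts as evidence about unknown discrete probabilities, smooth them over every integer from 1 to 15 so that unobserved values can appear, draw a perturbed probability vector from a Dirichlet distribution, and then draw the new counts from a multinomial. The function name sample_with_variability, the upper bound max_value=15, and the smoothing weight pseudo_count=0.5 are illustrative choices and do not come from the original data.

```python
import numpy as np

# Empirical values and counts from the 400-observation sample above.
values = np.array([1, 2, 3, 4, 7, 13])
counts = np.array([352, 28, 8, 4, 4, 4])


def sample_with_variability(values, counts, n_new, max_value=15,
                            pseudo_count=0.5, seed=None):
    """Draw n_new discrete observations from a perturbed version of the
    empirical distribution (hypothetical helper, for illustration only)."""
    if values.max() > max_value or values.min() < 1:
        raise ValueError("values must lie in 1..max_value")
    rng = np.random.default_rng(seed)

    # Full integer support, so values missing from the sample can be generated.
    support = np.arange(1, max_value + 1)

    # Put the observed counts on the support; unseen values start at zero.
    full_counts = np.zeros(support.size)
    full_counts[values - 1] = counts

    # Additive smoothing: every value gets a small pseudo-count (assumption).
    alpha = full_counts + pseudo_count

    # Perturbed probabilities: a different vector on every call.
    probs = rng.dirichlet(alpha)

    # Counts of each value among the n_new generated observations.
    new_counts = rng.multinomial(n_new, probs)
    return support, new_counts, probs


support, new_counts, probs = sample_with_variability(values, counts, 2000, seed=0)

# Expand the counts into individual observations and shuffle them.
rng = np.random.default_rng(1)
observations = rng.permutation(np.repeat(support, new_counts))

for value, count in zip(support, new_counts):
    print(value, count, count / new_counts.sum())
```

The amount of variability depends on the size of alpha: multiplying alpha by a factor below 1 spreads the drawn probabilities more widely, while a larger pseudo_count moves more mass onto values that were never observed. If a smooth decaying shape is preferred over smoothing the raw counts, a discrete family such as the geometric or Zipf distribution (both available as NumPy Generator methods) might be fitted instead; whether either fits this data well would need to be checked.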



