Friday, 24 April 2020

Random sampling of rows in pandas using a probability column

I am using Python with pandas to draw random samples from a DataFrame. My DataFrame looks like this:

The first column contains the time, the second the average rate, the third the 1-sigma, and the fourth the probability associated with the event described by the row.


I know that I can use this code to draw weighted samples:

samples = df.sample(n=100000, replace=True, weights='P>0', axis=0)

But I am not sure that a probability is the correct "weight" to use here. In short, I need a row with a low P>0 to be sampled less frequently than a row with a high P>0.
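One way to check this is with a small experiment. The following is only a minimal sketch: the data are made up, and every column name except `P>0` is an assumption. According to the pandas documentation, `DataFrame.sample` normalizes the weights so that they sum to 1 if they do not already, so each row should be drawn with probability proportional to its `P>0` value. The code compares the observed sampling frequencies with the normalized weights.

```python
import pandas as pd

# Hypothetical data: 'time', 'rate' and 'sigma' are assumed column names.
# The 'P>0' values deliberately do not sum to 1.
df = pd.DataFrame({
    "time": [0, 1, 2, 3],
    "rate": [1.0, 2.0, 3.0, 4.0],
    "sigma": [0.1, 0.2, 0.3, 0.4],
    "P>0": [0.1, 0.2, 0.4, 0.9],
})

# Weighted sampling with replacement; random_state makes the run repeatable.
samples = df.sample(n=100_000, replace=True, weights="P>0", random_state=0)

# Fraction of draws that landed on each original row.
observed = samples.index.value_counts(normalize=True).sort_index()

# What pandas should use internally: the weights rescaled to sum to 1.
expected = df["P>0"] / df["P>0"].sum()

comparison = pd.DataFrame({"expected": expected, "observed": observed})
print(comparison)
```

If the two columns agree to within sampling noise, then rows with a low `P>0` are indeed drawn less often than rows with a high `P>0`, in proportion to the probability itself.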

Is anyone willing to share opinions / different options on this?

Thank you!



