Saturday, 30 September 2017

Sampling rows in a data frame using the empirical probability distribution of a variable

I have the following problem.

Let's assume that we have a data frame with a few variables. Moreover, one variable (var_A) is a probability score - its values range from 0 to 1. I want to sample rows from this data frame so that a row with a higher value of var_A is more likely to be picked - so I guess that I have to draw from the empirical distribution of var_A. I know how to implement the empirical distribution function (EDF) of var_A, as suggested here, but I have no idea how to use this distribution for sampling rows.
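To make the goal concrete, here is a minimal sketch of one possible reading of it: plain weighted sampling, where var_A itself is used as the sampling weight instead of going through the EDF. This is an assumption about what is wanted, not a confirmed solution. The rows, the id column, the function name sample_rows and the choice to sample with replacement are all hypothetical, and a list of dicts stands in for the data frame so that only the Python standard library is needed.

```python
import random

# Hypothetical data: each dict stands in for one row of the data frame.
rows = [
    {"id": 1, "var_A": 0.05},
    {"id": 2, "var_A": 0.20},
    {"id": 3, "var_A": 0.75},
    {"id": 4, "var_A": 0.90},
]


def sample_rows(rows, n, weight_key="var_A", seed=None):
    """Draw n rows with replacement; each row is picked with
    probability proportional to its value of weight_key."""
    rng = random.Random(seed)
    weights = [row[weight_key] for row in rows]
    # random.choices normalises the weights internally,
    # so they do not need to sum to 1.
    return rng.choices(rows, weights=weights, k=n)


# Draw many rows and count how often each one was picked.
sample = sample_rows(rows, n=10_000, seed=0)
counts = {row["id"]: 0 for row in rows}
for row in sample:
    counts[row["id"]] += 1

print(counts)
```

Rows with a larger var_A should show up more often in counts. With an actual pandas data frame, the same idea should be expressible through the weights argument of DataFrame.sample.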

Can you please help me with this?

Thanks



