Wednesday, 23 June 2021

How to sample a dataframe using a dataframe as weights with pandas

I want to sample values from each column of a dataframe according to a dataframe of weights. All columns of the dataframe of weights sum to 1.

import pandas as pd

A = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
w = pd.DataFrame([[0.2, 0.5, 0.3], [0.1, 0.3, 0.6], [0.4, 0.5, 0.1]])
sampled_data = A.sample(n=10, replace=True, weights=w)

But this code raises the following error:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I would like the first column of A to be sampled according to the weights in the first column of w, and so on.

The expected result would look like this:

sampled_data =
  1 2 3
0 2 6 8
1 2 5 7
2 3 4 8
. .....
9 1 6 9
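
One possible approach, sketched here under assumptions rather than as a definitive answer: `DataFrame.sample` documents `weights` as a column name or a one-dimensional array-like, so a two-dimensional frame of weights has to be handled one column at a time. The helper `sample_columns` below is a hypothetical name, and it uses NumPy's `Generator.choice` instead of `DataFrame.sample`. It normalises each weight column, because in the example `w` it is the rows, not the columns, that sum to 1. The illustrated output above draws its first column from 1, 2, 3, which is the first row of `A`, so the transposed call at the end may be what is actually intended.

```python
import numpy as np
import pandas as pd

A = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
w = pd.DataFrame([[0.2, 0.5, 0.3], [0.1, 0.3, 0.6], [0.4, 0.5, 0.1]])


def sample_columns(values, weights, n, seed=None):
    """Draw n values per column of `values`, weighted by the same column of `weights`.

    Hypothetical helper: assumes both frames share the same shape and column labels.
    """
    rng = np.random.default_rng(seed)
    sampled = {}
    for col in values.columns:
        p = weights[col].to_numpy(dtype=float)
        p = p / p.sum()  # normalise so this column's weights sum to 1
        # Sample with replacement from this column only, independently of the others.
        sampled[col] = rng.choice(values[col].to_numpy(), size=n, replace=True, p=p)
    return pd.DataFrame(sampled)


# Literal reading of the question: column j of A weighted by column j of w.
sampled_data = sample_columns(A, w, n=10, seed=0)

# Alternative reading: each inner list of the constructor is one "column".
sampled_by_row = sample_columns(A.T, w.T, n=10, seed=0)
```

Each column is drawn independently, so a row of the result does not correspond to any single row of `A`. The `seed` argument is only there to make runs reproducible.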


