Starting from this simple dataframe df:
df = pd.DataFrame({'c':[1,1,2,2,2,2,3,3,3], 'n':[1,2,3,4,5,6,7,8,9], 'N':[1,1,2,2,2,2,2,2,2]})
I'm trying to select N random values from n for each c. So far I have managed to group by c and get a single element per group with:
sample = df.groupby('c').apply(lambda x :x.iloc[np.random.randint(0, len(x))])
that returns:
   N  c  n
c
1  1  1  2
2  2  2  4
3  2  3  8
My expected output would be something like:
   N  c  n
c
1  1  1  2
2  2  2  4
2  2  2  3
3  2  3  8
3  2  3  7
That is, one sample for c=1 and two samples each for c=2 and c=3, as given by the N column.
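One possible way to get there, as a minimal sketch rather than a definitive answer: sample each group separately with DataFrame.sample, taking the sample size from that group's N value. This assumes N is constant within each group of c and never exceeds the group's size (sampling is without replacement, so a larger N would raise an error). The helper name sample_group and its seed parameter are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'c': [1, 1, 2, 2, 2, 2, 3, 3, 3],
                   'n': [1, 2, 3, 4, 5, 6, 7, 8, 9],
                   'N': [1, 1, 2, 2, 2, 2, 2, 2, 2]})

def sample_group(group, seed=None):
    # Hypothetical helper: draw as many rows as the group's N value says.
    # Assumes N is the same on every row of the group, so the first row is enough.
    k = int(group['N'].iloc[0])
    # Sampling is without replacement; k must not exceed len(group).
    return group.sample(n=k, random_state=seed)

# Sample every group of c on its own, then stack the pieces back together.
# Pass seed=None (the default) to get a different draw on each run.
sample = pd.concat([sample_group(g, seed=0) for _, g in df.groupby('c')])
print(sample)
```

The result keeps the original row labels as its index rather than c. If the index should be c, as in the expected output above, something like sample.set_index('c', drop=False) should do it.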