I have a dataFrame really similar to that, but with thousands of values :
import numpy as np
import pandas as pd
# Setup fake data.
np.random.seed([3, 1415])
df = pd.DataFrame({
'Class': list('AAAAAAAAAABBBBBBBBBB'),
'type': (['short']*5 + ['long']*5) *2,
'image name': (['image01']*2 + ['image02']*2)*5,
'Value2': np.random.random(20)})
I was able to find a way to do a random sampling of 2 values per images, per Class and per Type with the following code :
df2 = df.groupby(['type', 'Class', 'image name'])[['Value2']].apply(lambda s: s.sample(min(len(s),2)))
I got the following result :
I'm looking for a way to subset that table to be able to randomly choose a random image ('image name') per type and per Class (and conserve the 2 values for the randomly selected image.
Aucun commentaire:
Enregistrer un commentaire