Suppose I have a Pandas dataframe, df, which has the following structure:
           Column 1  Column 2  ....  Column 100
Row 1      0.233     0.555     ....  0
Row 2      0.231     0.514     ....  2
..
Row 15000  0.232     0.455     ....  3
Column 100 represents the class each row belongs to (an integer from 0 to 14). Each class has 1000 rows associated with it. For each class (denoted by the integers in Column 100), I want to select only 200 samples at random and create a new dataframe, df_new, which will have 15 x 200 = 3000 rows. Is there a good way to do this?
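One possible approach, as a minimal sketch: group the rows by the class column and draw 200 rows from each group with `groupby(...).sample(...)`, which is available in pandas 1.1 and later. The dataframe built here is only a hypothetical stand-in with random values, and the column names `"Column 1"` through `"Column 100"` are assumed to match the layout described above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df: 15 classes x 1000 rows each,
# 99 feature columns plus the class label in "Column 100".
rng = np.random.default_rng(0)
n_classes, rows_per_class, n_features = 15, 1000, 99
features = rng.random((n_classes * rows_per_class, n_features))
columns = [f"Column {i}" for i in range(1, n_features + 1)]
df = pd.DataFrame(features, columns=columns)
df["Column 100"] = np.repeat(np.arange(n_classes), rows_per_class)

# Draw 200 rows without replacement from each class.
# random_state is fixed only to make the draw reproducible.
df_new = df.groupby("Column 100").sample(n=200, random_state=0)

# Optional: discard the original row labels.
df_new = df_new.reset_index(drop=True)

print(df_new.shape)
```

On older pandas versions, `df.groupby("Column 100", group_keys=False).apply(lambda g: g.sample(n=200))` should give a comparable result, though it is usually slower.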