Suppose I have a pandas DataFrame, df, with the following structure:
            Column 1   Column 2   ...   Column 100
Row 1       0.233      0.555      ...   0
Row 2       0.231      0.514      ...   2
...
Row 15000   0.232      0.455      ...   3
Column 100 holds the class each row belongs to, an integer from 0 to 14. Each class has 1000 rows associated with it. For each class, I want to randomly select only 200 rows and collect them in a new DataFrame, df_new, which will have 15 x 200 = 3000 rows. Is there a good way to do this?
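One possible way to do the per-class sampling described above is a grouped sample. This is a minimal sketch, not a definitive answer: it assumes the class column is literally named "Column 100", that pandas 1.1 or later is installed (for `DataFrameGroupBy.sample`), and it builds a small synthetic df with only three columns as a stand-in for the real data. The seeds passed to `default_rng` and `random_state` are arbitrary choices for reproducibility.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for df (assumption): 15 classes (0-14), 1000 rows each.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(15), 1000)
rng.shuffle(labels)  # mix the classes so rows are not ordered by class
df = pd.DataFrame({
    "Column 1": rng.random(15000),
    "Column 2": rng.random(15000),
    "Column 100": labels,
})

# Draw 200 rows at random, without replacement, from each class.
df_new = df.groupby("Column 100").sample(n=200, random_state=42)

print(df_new.shape)
print(df_new["Column 100"].value_counts().sort_index())
```

The sampled rows keep their original index labels, so `df_new.reset_index(drop=True)` can be chained on if a fresh 0..2999 index is wanted. On pandas versions older than 1.1, `df.groupby("Column 100", group_keys=False).apply(lambda g: g.sample(n=200))` should give an equivalent result.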