Friday, June 7, 2019

How to randomly select rows from a pandas DataFrame based on a specific condition?

Suppose I have a pandas DataFrame, df, with the following structure:

         Column 1      Column 2 ....     Column 100
Row 1    0.233           0.555              0
Row 2    0.231           0.514              2
..
Row 15000    0.232           0.455          3

Column 100 holds the class each row belongs to (an integer from 0 to 14). Each class has 1000 rows associated with it. For each class, I want to randomly select only 200 rows and collect them in a new DataFrame, df_new, which will have 15 x 200 = 3000 rows. Is there a good way to achieve this?
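One possible approach, sketched here under some assumptions, is to group by the class column and sample within each group. The frame below is a synthetic stand-in for df (only three of the hundred columns, random values), the label column is assumed to be literally named "Column 100", and `groupby(...).sample` needs pandas 1.1 or later. The `random_state` value is arbitrary and only there to make the draw reproducible.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for df (assumption): 15 classes x 1000 rows each,
# with the class label stored in a column named "Column 100".
rng = np.random.default_rng(0)
n_classes, rows_per_class = 15, 1000
n_rows = n_classes * rows_per_class
df = pd.DataFrame({
    "Column 1": rng.random(n_rows),
    "Column 2": rng.random(n_rows),
    "Column 100": np.repeat(np.arange(n_classes), rows_per_class),
})

# Draw 200 rows without replacement from every class.
# DataFrameGroupBy.sample is available in pandas >= 1.1.
df_new = df.groupby("Column 100").sample(n=200, random_state=42)

print(df_new.shape[0])
print(df_new["Column 100"].value_counts().sort_index())
```

On older pandas versions, something like `pd.concat([g.sample(n=200) for _, g in df.groupby("Column 100")])` should give an equivalent result. Either way df_new keeps the original row labels; call `reset_index(drop=True)` if a fresh 0..2999 index is preferred.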



