Monday, 19 August 2019

Randomly select hour from dataframe

I am having a hard time randomly selecting rows from a dataframe. In general, choosing a single row is not a problem using np.random.choice(data,size=1000); I assume that replace=True is the default. However, I need to randomly select an hour and, as output, receive the 4 rows for its quarter-hours.

The dataframe to choose from is the following (1132 rows):

data=
                     Price  Consume    Feed
StartTime                                  
2018-07-04 02:00:00  45.80    67.91   67.91
2018-07-04 02:15:00  45.80    51.05   51.05
2018-07-04 02:30:00  45.80    46.12   46.12
2018-07-04 02:45:00  45.80    46.86   46.86
2018-07-11 05:00:00  43.80    43.49   43.49
2018-07-11 05:15:00  43.80    50.71   50.71
2018-07-11 05:30:00  43.80    48.19   48.19
2018-07-11 05:45:00  43.80    40.02   40.02

My desired output is something like this:

Assuming the random generator has "selected" 2018-07-11 05:00:00, the output would be

2018-07-11 05:00:00  43.80    43.49   43.49
2018-07-11 05:15:00  43.80    50.71   50.71
2018-07-11 05:30:00  43.80    48.19   48.19
2018-07-11 05:45:00  43.80    40.02   40.02

Is it possible to randomly select a day-hour (a date plus an hour) directly from the dataframe and repeat this 1000 times? I am afraid that using an extra dataframe to select an hour and then looking up the corresponding values in the original dataframe will be too time-consuming. I am confident that this should be doable in Python, but I couldn't find any tips on this.

Thanks for any help!
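
One possible way to do this, as a minimal sketch rather than a definitive answer: label each row with the hour it falls in, draw hour labels at random with replacement, and gather the matching rows by position. The small `data` frame below is a stand-in rebuilt from the rows shown above, and the helper name `sample_hours`, its `seed` parameter, and the extra `draw` column are assumptions made for illustration. The sketch assumes the index is a DatetimeIndex; hours with fewer than 4 rows would simply return fewer rows.

```python
import numpy as np
import pandas as pd

# Stand-in for the real `data` (1132 rows in the question); only the 8 rows shown above.
index = pd.DatetimeIndex(
    [
        "2018-07-04 02:00", "2018-07-04 02:15", "2018-07-04 02:30", "2018-07-04 02:45",
        "2018-07-11 05:00", "2018-07-11 05:15", "2018-07-11 05:30", "2018-07-11 05:45",
    ],
    name="StartTime",
)
data = pd.DataFrame(
    {
        "Price": [45.80] * 4 + [43.80] * 4,
        "Consume": [67.91, 51.05, 46.12, 46.86, 43.49, 50.71, 48.19, 40.02],
        "Feed": [67.91, 51.05, 46.12, 46.86, 43.49, 50.71, 48.19, 40.02],
    },
    index=index,
)


def sample_hours(df, n_draws, seed=None):
    """Draw `n_draws` random hours (with replacement) and return all their rows."""
    rng = np.random.default_rng(seed)

    # Integer code per row identifying its hour, plus the distinct hours themselves.
    codes, hours = pd.factorize(df.index.floor("h"))

    # Row positions belonging to each distinct hour, computed once up front.
    positions = [np.flatnonzero(codes == c) for c in range(len(hours))]

    # Pick hours by integer code, then collect the row positions of each pick.
    picks = rng.integers(0, len(hours), size=n_draws)
    chunks = [positions[p] for p in picks]
    rows = np.concatenate(chunks)

    # Tag every row with the number of the draw that produced it.
    draw_id = np.repeat(np.arange(n_draws), [len(c) for c in chunks])

    out = df.iloc[rows].copy()
    out.insert(0, "draw", draw_id)
    return out


sampled = sample_hours(data, n_draws=1000, seed=0)
print(sampled.head(8))
```

Because the hours are sampled as integer codes and the rows are fetched in a single `iloc` call, no second dataframe or per-draw label lookup is needed; the `draw` column keeps repeated picks of the same hour distinguishable.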



