I need to select random rows from a PostgreSQL table within a given date-time range. Currently I query the table for the whole range and then randomly sample the rows in pandas (see below). This is becoming very inefficient because there are about 10 GB of data within the range. Is there a better way to do this? Please advise.
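import pandas as pd
# Load every row in the range (bound parameters; the %(name)s style assumes a psycopg2 connection), then sample in pandas.
sp = pd.read_sql("SELECT * FROM table1 WHERE timestamp >= %(start)s AND timestamp <= %(end)s", con, params={"start": sampling_start_date, "end": sampling_end_date})
random_subset = sp.sample(n=300)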
The timestamp format is as follows:
sampling_start_date = "2018-08-17 20:00:00"
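One possible direction, offered as a rough sketch rather than a definitive answer, is to let PostgreSQL pick the random rows so that only 300 rows are transferred to pandas instead of the whole range. The example below uses `ORDER BY random() LIMIT n`. The names `SAMPLE_SQL`, `sample_params`, and `fetch_random_rows` are hypothetical, and the `%(name)s` placeholder style assumes a psycopg2-backed connection.

```python
# Sketch: sample inside the database so only n rows reach pandas.
# Assumes the table1/timestamp names from the question and a psycopg2-style connection.
SAMPLE_SQL = """
    SELECT *
    FROM table1
    WHERE timestamp >= %(start)s AND timestamp <= %(end)s
    ORDER BY random()
    LIMIT %(n)s
"""


def sample_params(start, end, n=300):
    """Build the bound parameters for SAMPLE_SQL (hypothetical helper)."""
    return {"start": start, "end": end, "n": int(n)}


def fetch_random_rows(con, start, end, n=300):
    """Run the sampling query and return the n random rows as a DataFrame."""
    import pandas as pd  # imported here so the helpers above work without pandas

    return pd.read_sql(SAMPLE_SQL, con, params=sample_params(start, end, n))
```

Usage would look like `random_subset = fetch_random_rows(con, sampling_start_date, sampling_end_date)`. Note that the server still has to read and sort every row in the range, so this mainly saves the transfer and the memory on the client side; an index on `timestamp` should help the range filter.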