Wednesday, March 6, 2019

Randomly Choose Rows from Table - Python Pandas Read SQL

I need to select rows at random from a PostgreSQL table within a given date-time range. My current approach is to query the table for the whole range and then sample the rows in pandas (see below). This is becoming very inefficient, because the range contains about 10 GB of data. Is there a better way to do this? Please advise.

sp = pd.read_sql("SELECT * FROM table1 WHERE timestamp >= '"+sampling_start_date+"' and timestamp <= '"+sampling_end_date+"'", con)

random_subset = sp.sample(n=300)

The timestamp format is as follows:

sampling_start_date = "2018-08-17 20:00:00"
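
One direction that might help is to let the database do the sampling, so that only the 300 chosen rows are transferred instead of the whole range. The sketch below is only an illustration under assumptions: the helper name build_sample_query, the end date value, and the "%s" placeholder style (the one psycopg2 uses) are not from the original post. It also passes the dates as bound parameters rather than concatenating them into the SQL string.

```python
def build_sample_query(table, n, placeholder="%s"):
    """Build a query that returns n random rows inside a time range.

    Hypothetical helper. The table name cannot be sent as a bound
    parameter, so it is assumed to come from trusted code, not user input.
    """
    return (
        f"SELECT * FROM {table} "
        f"WHERE timestamp >= {placeholder} AND timestamp <= {placeholder} "
        f"ORDER BY random() LIMIT {int(n)}"
    )


sampling_start_date = "2018-08-17 20:00:00"
sampling_end_date = "2018-08-18 20:00:00"  # assumed value, same format as the start date

query = build_sample_query("table1", 300)

# With an open connection `con`, as in the question, the sample could be loaded with:
# random_subset = pd.read_sql(query, con, params=(sampling_start_date, sampling_end_date))
```

Note that ORDER BY random() still makes the server read every row in the range, so an index on the timestamp column matters; the gain is that only 300 rows cross the network and reach pandas.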



