lundi 30 mars 2015

random sampling with or without replacement in postgresql

I am looking for the possible ways of random sampling in postgreql, I found couple of methods to do that with different advantages and disadvantages,the naive way to do that is :



select * from Table_Name
order by random()
limit 10;


another faster method is:



select * from Table_Name
WHERE random()<= 0.01
order by random()
limit 10;


(although that 0.01 depends on the table size and the sample size,this is just an example).



in both of these queries a random number is generated for each row and sorted based on those random generated numbers and then in the sorted numbers the first 10 is selected as the final result,so I think these should be sampling without replacement.



now what I want to do is to somehow turn this sampling methods into sampling with replacement,how is that possible? or is there any other random sampling method with replacement in postgresql?





Aucun commentaire:

Enregistrer un commentaire