mardi 17 janvier 2017

stratified random sample with counts in postgresql

I have the following table generated by a calculate_table() procedure:

table(
  id integer,
  type integer
)

I am looking to do a stratified random sample, where I choose a random id, randomized by type, as well as return the counts for type and counts for id.

So in the following example:

id,type
1,1
2,1
3,1
4,1
5,2
6,2
7,2
8,3
8,4

A randomization could choose the following:

chosen_type: 2
type_count: 4
chosen_id: 6
id_count: 3

So there would be a 25% chance of getting type 2, and if type 2 is chosen, there is 33% of getting id 6.

The following wouldn't work because it is randomly choosing from all id's, independent of their type, which isn't what I want.

select * from calculate_table()
order by random()
limit 1;

I am having trouble trying to avoid calling the calculate_table() procedure more than once and/or storing things in arrays. How can I go about this?




Aucun commentaire:

Enregistrer un commentaire