jeudi 18 janvier 2018

Grouped random sample MS SQL Server

Lets say that I want to randomly sample a database with id of the following form

id 
A    
A
A
B
C
C
D
D
E
F

Instead of doing a completely random sample how would I sample 50% of the distinct groups. The difference is that a random sample would give me half of all of the records while a grouped sample would give me half of all of the distinct ids.

This is necessary because I would not want to partially sample an id. If an A is selected then all 3 As should be included in this kind of grouped random sample.

Lets say for example the group set {A,B,C} was randomly selected. The query would produce the following

id
A
A
A
B
C
C




Aucun commentaire:

Enregistrer un commentaire