Monday, September 28, 2020

Hive: randomly select a different number of samples from each group

Is it possible to randomly select samples of different sizes from each group in Hive? For instance, I have a table like this:

|-----------------|----------------|
|     ID          |     Value      |
|-----------------|----------------|
|    IKJ12        |    5           |
|-----------------|----------------|
|    IKJ12        |    9           |
|-----------------|----------------|
|    IKJ12        |    10          |
|-----------------|----------------|
|    IKJ09        |    7           |
|-----------------|----------------|
|    IKJ09        |    14          |
|-----------------|----------------|

I would like to randomly select two rows where ID = IKJ12 and one row where ID = IKJ09. One possible result looks like this:

|-----------------|----------------|
|     ID          |     Value      |
|-----------------|----------------|
|    IKJ12        |    5           |
|-----------------|----------------|
|    IKJ12        |    10          |
|-----------------|----------------|
|    IKJ09        |    14          |
|-----------------|----------------|
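
To make the desired behavior concrete, here is a minimal Python sketch of the logic, not a Hive query. It gives every row a random sort key, ranks the rows within each ID by that key, and keeps the first n rows, where n is looked up per ID. The names sample_per_group, sizes and seed are hypothetical and chosen only for illustration. In Hive, the same idea would presumably be expressed with row_number() over (partition by ID order by rand()) plus a filter on the rank for each ID, but that query is an assumption and has not been run against this table.

```python
import random
from collections import defaultdict


def sample_per_group(rows, sizes, seed=None):
    """Return sizes[id] randomly chosen rows for each id listed in `sizes`.

    rows  -- iterable of (id, value) pairs
    sizes -- dict mapping id -> number of rows to keep (hypothetical parameter)
    seed  -- optional seed, only to make a run repeatable
    """
    rng = random.Random(seed)

    # Group the rows by id and attach a random sort key to each one,
    # which plays the role of "order by rand()" inside a partition.
    groups = defaultdict(list)
    for row_id, value in rows:
        groups[row_id].append((rng.random(), value))

    sample = []
    for row_id, n in sizes.items():
        # Rank the rows of this group by their random key and keep the first n.
        ranked = sorted(groups.get(row_id, []))
        sample.extend((row_id, value) for _, value in ranked[:n])
    return sample


rows = [("IKJ12", 5), ("IKJ12", 9), ("IKJ12", 10), ("IKJ09", 7), ("IKJ09", 14)]
sizes = {"IKJ12": 2, "IKJ09": 1}
print(sample_per_group(rows, sizes))
```

Each call without a seed returns a different random selection, always with two IKJ12 rows and one IKJ09 row. If a group has fewer rows than requested, all of its rows are returned.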


