I have the following data:
group_id id name
---- -- ----
G1 1 apple
G1 2 orange
G1 3 apple
G1 4 banana
G1 5 apple
G2 6 orange
G2 7 apple
G2 8 apple
G3 7 banana
G3 8 orange
I want to set random_pick to 1 for one randomly chosen record in each group and to 0 for every other record, like this:
group_id id name random_pick
---- -- ---- -------------------
G1 1 apple 0
G1 2 orange 0
G1 3 apple 0
G1 4 banana 0
G1 5 apple 1
G2 6 orange 0
G2 7 apple 1
G2 8 apple 0
G3 7 banana 0
G3 8 orange 1
My thoughts:
- Add a column with 0 as the default value
- Use Window.partitionBy("group_id") to get the row count of each group, pick a random number between 1 and that count, and set the matching record to 1
But how do I do this in Scala? :(
Thanks in advance!
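For what it's worth, here is a minimal sketch of one way this might be done with the Spark DataFrame API. It takes a slightly different route from the count-based idea above: each row gets a random value, rows are ranked within their group by that value, and the first-ranked row is flagged. The object name `RandomPick`, the helper `withRandomPick`, the temporary column `rnd`, and the local SparkSession are all assumptions made for illustration; only the column names `group_id`, `id`, `name` and `random_pick` come from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{col, rand, row_number, when}

// Hypothetical helper object, not part of any library.
object RandomPick {

  // Flags exactly one randomly chosen row per group_id with 1, all others with 0.
  def withRandomPick(df: DataFrame): DataFrame = {
    // Rank rows inside each group by a temporary random column (assumed name: rnd).
    val byGroup = Window.partitionBy("group_id").orderBy(col("rnd"))

    df.withColumn("rnd", rand()) // one uniform random number per row
      .withColumn("random_pick",
        when(row_number().over(byGroup) === 1, 1).otherwise(0))
      .drop("rnd")
  }

  def main(args: Array[String]): Unit = {
    // Local session for demonstration only; a real job would use its own.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("random-pick-sketch")
      .getOrCreate()
    import spark.implicits._

    // Sample rows copied from the question.
    val df = Seq(
      ("G1", 1, "apple"), ("G1", 2, "orange"), ("G1", 3, "apple"),
      ("G1", 4, "banana"), ("G1", 5, "apple"),
      ("G2", 6, "orange"), ("G2", 7, "apple"), ("G2", 8, "apple"),
      ("G3", 7, "banana"), ("G3", 8, "orange")
    ).toDF("group_id", "id", "name")

    withRandomPick(df).orderBy("group_id", "id").show()

    spark.stop()
  }
}
```

Because `row_number` assigns distinct ranks even when two random values tie, each group should end up with exactly one 1, and no separate count per group is needed. The picked rows change from run to run; passing a seed, as in `rand(42L)`, should make the result repeatable as long as the input partitioning stays the same.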