Monday, 31 August 2020

Selecting from pandas groups without replacement when possible

Say that I have a DataFrame that looks like:

Name Group_Id
A    1
B    2
C    2

I want a piece of code that selects n sets such that the sets contain, as far as possible, different members of each group. Each set must contain one representative from every group, and the representatives should be picked at random. The same representative may appear in more than one set only if its group has fewer than n members. n is less than or equal to the size of the largest group. For example, for the DataFrame above and n=2, this would be a valid result:

set 1 
Name Group_Id
A    1
B    2

set 2 
Name Group_Id
A    1
C    2

However, this one is not, because B is picked for group 2 in both sets even though C is available:

set 1 
Name Group_Id
A    1
B    2

set 2 
Name Group_Id
A    1
B    2
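
One possible way to meet these requirements is sketched below. It is only an illustration, not a definitive answer: the function name sample_sets, its parameters group_col and seed, and the assumption that the DataFrame has a unique index are all mine. The idea is to shuffle each group once and then cycle through the shuffled members, so a member is reused only after every other member of its group has been picked.

```python
import numpy as np
import pandas as pd


def sample_sets(df, n, group_col="Group_Id", seed=None):
    """Return a list of n DataFrames, each with one random row per group.

    Hypothetical helper; assumes df has a unique index.
    """
    rng = np.random.default_rng(seed)
    picks = []
    for _, group in df.groupby(group_col):
        # Shuffle the row labels of this group once.
        order = rng.permutation(group.index.to_numpy())
        # np.resize repeats the shuffled labels cyclically (or truncates)
        # to length n, so repeats happen only when the group is smaller than n.
        picks.append(np.resize(order, n))
    # Row i of the stacked array holds the labels chosen for set i.
    picks = np.column_stack(picks)
    return [df.loc[row] for row in picks]


df = pd.DataFrame({"Name": ["A", "B", "C"], "Group_Id": [1, 2, 2]})
for i, s in enumerate(sample_sets(df, n=2), start=1):
    print(f"set {i}")
    print(s.to_string(index=False))
```

With the example DataFrame and n=2, A appears in both sets because group 1 has a single member, while B and C each appear in exactly one set (which of the two comes first is random).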


