Tuesday, 18 August 2020

Pair two data frames with conditions

I have two data frames (a and b), each with only two columns: ID and SID. I want to pair IDs across the two data frames at random, with one restriction: the two IDs in a pair cannot share an SID. Below is an example of what I want; the real data contain thousands of IDs and many shared SIDs. I assume I will use zip at some point, but I have no clue what else to do.

Edit: a solution must also allow for data frames of unequal length (some IDs may go unpaired).

# The data

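import pandas as pd

# Build the two data frames described above (columns: ID and SID)
a = pd.DataFrame({'ID': ['tom', 'nick', 'krish', 'jack'],
                  'SID': ['hal', 'pete', 'zen', 'bop']})
b = pd.DataFrame({'ID': ['tim', 'sasha', 'alex', 'jose'],
                  'SID': ['hal', 'kora', 'zen', 'felix']})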

# One possible output
print(ab)

tom sasha
nick tim
krish jose
jack alex
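
One way to get pairs like these, as a minimal sketch rather than a definitive answer: shuffle both sets of rows, then walk through a and give each ID the first unused ID from b whose SID differs. The function name random_pairs and its seed parameter are made up for illustration. It uses only the standard library and reads the ID and SID columns through zip, so it should accept either plain dicts of lists or pandas data frames. The approach is greedy, so it does not guarantee the maximum possible number of pairs or a uniform draw over all valid pairings. IDs that find no partner simply stay unpaired, which also covers inputs of unequal length.

```python
import random


def random_pairs(a, b, seed=None):
    """Randomly pair IDs from a with IDs from b so that no pair shares an SID.

    a and b are assumed to be indexable by 'ID' and 'SID'
    (a dict of lists or a pandas DataFrame). Returns a list of (ID_a, ID_b).
    """
    rng = random.Random(seed)  # seed is only there to make runs repeatable

    # Turn each input into a list of (ID, SID) rows and shuffle both.
    rows_a = list(zip(a['ID'], a['SID']))
    rows_b = list(zip(b['ID'], b['SID']))
    rng.shuffle(rows_a)
    rng.shuffle(rows_b)

    pairs = []
    for id_a, sid_a in rows_a:
        # Take the first remaining row of b whose SID differs from sid_a.
        for i, (id_b, sid_b) in enumerate(rows_b):
            if sid_a != sid_b:
                pairs.append((id_a, id_b))
                rows_b.pop(i)  # each ID from b is used at most once
                break
        # If no remaining row qualifies, id_a is left unpaired.
    return pairs


# Same example data as in the question, written as plain dicts.
a = {'ID': ['tom', 'nick', 'krish', 'jack'],
     'SID': ['hal', 'pete', 'zen', 'bop']}
b = {'ID': ['tim', 'sasha', 'alex', 'jose'],
     'SID': ['hal', 'kora', 'zen', 'felix']}

ab = random_pairs(a, b, seed=0)
for id_a, id_b in ab:
    print(id_a, id_b)
```

Because the inner loop scans b linearly, the worst case is roughly quadratic in the number of rows. With thousands of IDs that is usually acceptable, since most rows find a partner within the first few candidates. If leaving as few IDs unpaired as possible matters, a proper bipartite matching algorithm would be the thing to look at instead.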


