Friday, April 26, 2019

Create new column filled with random elements based on a categorical column

I have a pandas dataframe that looks like this:

ID  Cat
87    A 
56    A 
67    A  
76    D  
36    D 
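
For reproducibility, the sample frame above could be built roughly as follows. This is a minimal sketch: the variable name `df` and the plain string dtype for Cat are assumptions, not something stated in the question.

```python
import pandas as pd

# Sample data copied from the table above.
# Assumption: Cat is stored as plain strings; a pandas "category" dtype would also work.
df = pd.DataFrame({
    "ID": [87, 56, 67, 76, 36],
    "Cat": ["A", "A", "A", "D", "D"],
})
print(df)
```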

Column ID contains unique integers, while Cat contains a categorical variable. I would like to add two new columns whose values depend on Cat.

The desired result should look something like this (the new values are random, so this is only one possible outcome):

ID  Cat  New1   New2
87    A    67    36
56    A    67    76
67    A    56    36
76    D    36    56
36    D    76    67

Column New1: for each row, pick a random ID from the rows with the SAME category as the current row, sampling with replacement.

Column New2: for each row, pick a random ID from the rows with a DIFFERENT category from the current row, sampling with replacement.

How can I do this efficiently?
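
One possible approach, sketched here under a few assumptions, is to loop over the distinct categories (not over the rows) and draw all values for a category in one vectorized call to NumPy's `Generator.choice`. The function name `add_random_ids` and the seed are hypothetical. The sketch also assumes that New1 may return the row's own ID, since the question does not rule that out, and that at least two categories exist.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID": [87, 56, 67, 76, 36],
    "Cat": ["A", "A", "A", "D", "D"],
})

def add_random_ids(df, rng):
    """Return a copy of df with New1 (same-category ID) and New2 (other-category ID)."""
    out = df.copy()
    ids = out["ID"].to_numpy()
    cats = out["Cat"].to_numpy()
    new1 = np.empty(len(out), dtype=ids.dtype)
    new2 = np.empty(len(out), dtype=ids.dtype)
    # One iteration per distinct category, so the cost scales with the
    # number of categories, not the number of rows.
    for cat in pd.unique(cats):
        mask = cats == cat
        n = mask.sum()
        # Same category, with replacement (a row may draw its own ID: assumption).
        new1[mask] = rng.choice(ids[mask], size=n, replace=True)
        # Any other category, with replacement. This raises ValueError
        # if the frame contains only a single category.
        new2[mask] = rng.choice(ids[~mask], size=n, replace=True)
    out["New1"] = new1
    out["New2"] = new2
    return out

rng = np.random.default_rng(0)  # arbitrary seed, only to make runs repeatable
result = add_random_ids(df, rng)
print(result)
```

If a row must never draw its own ID for New1, the same-category pool would need to exclude that row, which this sketch does not do.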



