Thursday, 23 July 2020

pandas: How to assign random values from a list to a new column, excluding the value in another column of the same row?

I have a data set of about 50k rows that records a Job ID and the User ID of the person who performed each job. This sample I've created represents it:

import pandas as pd

df = pd.DataFrame(data={
    'job_id': ['00001', '00002', '00003', '00004', '00005', '00006', '00007', '00008', '00009', '00010', '00011', '00012', '00013', '00014', '00015'],
    'user_id': ['frank', 'josh', 'frank', 'jessica', 'josh', 'eric', 'frank', 'josh', 'eric', 'jessica', 'jessica', 'james', 'frank', 'josh', 'james']
})


    job_id  user_id
0   00001   frank
1   00002   josh
2   00003   frank
3   00004   jessica
4   00005   josh
5   00006   eric
6   00007   frank
7   00008   josh
8   00009   eric
9   00010   jessica
10  00011   jessica
11  00012   james
12  00013   frank
13  00014   josh
14  00015   james

I wish to assign peer reviewers to those jobs in a new column called 'reviewer_id', where the reviewer is drawn from the list of user_ids but cannot be the same as the row's own user_id. For example, frank can't review his own job, but jessica can.

My desired output would be something like this:

    job_id  user_id reviewer_id
0   00001   frank   jessica
1   00002   josh    frank
2   00003   frank   josh
3   00004   jessica eric
4   00005   josh    james
...
11  00012   james   frank
12  00013   frank   josh
13  00014   josh    eric
14  00015   james   eric
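
The constraint on the desired output can be written down as a small check. This is only a sketch: the helper name is_valid_assignment is hypothetical, and it assumes the frame passed in already has both a 'user_id' and a 'reviewer_id' column. The two sample rows are rows 1 and 2 of the desired output above.

```python
import pandas as pd

def is_valid_assignment(df):
    """Return True if every reviewer is a known user and nobody reviews their own job."""
    # Every reviewer must appear somewhere in the user_id column.
    known = df['reviewer_id'].isin(df['user_id']).all()
    # No row may have the same person as both user and reviewer.
    not_self = (df['reviewer_id'] != df['user_id']).all()
    return bool(known and not_self)

# Rows 1 and 2 of the desired output: josh is reviewed by frank, frank by josh.
sample = pd.DataFrame({
    'job_id': ['00002', '00003'],
    'user_id': ['josh', 'frank'],
    'reviewer_id': ['frank', 'josh'],
})
print(is_valid_assignment(sample))
```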

I'm quite new to Python, so the only approach I can think of is to get a list of unique user_ids with reviewers = df['user_id'].unique().tolist(), then iterate over the dataframe and assign a reviewer ID row by row. But I know you should typically avoid iterating over a pandas dataframe, so I'm lost on how to go about something like this.
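
One possible way to avoid the row-by-row loop, offered as a sketch rather than a definitive answer: encode each user as an integer position in the list of unique users, add a random offset between 1 and n-1, and wrap around with a modulo. Because the offset is never 0 or n, a row can never map back to its own user. The function name assign_reviewers and the seed parameter are assumptions made for illustration, and the sketch assumes 'user_id' has no missing values and at least two distinct users.

```python
import numpy as np
import pandas as pd

def assign_reviewers(df, seed=None):
    """Return a copy of df with a random 'reviewer_id' that differs from 'user_id' in every row."""
    rng = np.random.default_rng(seed)
    # codes[i] is the position of row i's user in `reviewers` (the unique users).
    codes, reviewers = pd.factorize(df['user_id'])
    reviewers = np.asarray(reviewers)
    n = len(reviewers)
    if n < 2:
        raise ValueError("need at least two distinct users to assign a reviewer")
    # Offsets lie in 1..n-1 (upper bound is exclusive), so (code + offset) % n != code.
    offsets = rng.integers(1, n, size=len(df))
    out = df.copy()
    out['reviewer_id'] = reviewers[(codes + offsets) % n]
    return out

df = pd.DataFrame({
    'job_id': ['00001', '00002', '00003', '00004', '00005'],
    'user_id': ['frank', 'josh', 'frank', 'jessica', 'josh'],
})
print(assign_reviewers(df, seed=0))
```

Each row picks uniformly among the other users, but nothing here balances the workload, so some users may end up with more reviews than others.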



