Monday, November 18, 2019

Python: Randomly select a subgroup in a group

I have a dataframe that looks like:

patient_id   note_id    lines
A              10         1
A              10         2
A              10         3
A              29         1
A              29         2 
B              12         1
B              95         1
B              95         2
B              95         3
C......
D...... 
E              14         1
E              55         1 
E              87         1
......

Each patient can have multiple notes, and each note may contain more than one line. Say that I have 20 patients, 50 notes and 150 lines. How can I randomly select 3 patients and, for each of them, pick exactly one note at random, keeping all lines of that note? For example, with one random note per randomly selected patient_id, I would get:

patient_id   note_id    lines
A              29         1
A              29         2 
B              12         1 
E              55         1
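
One possible approach, sketched here with pandas and not tested against the real data: reduce the frame to unique (patient_id, note_id) pairs, sample the patients, sample one note per sampled patient, then merge back to recover every line of the chosen notes. The function name `sample_one_note_per_patient` and the `seed` parameter are made up for illustration, the toy frame only contains the rows shown above (patients C and D are elided in the question, so they are left out), and `groupby(...).sample` assumes pandas 1.1 or newer.

```python
import pandas as pd

# Toy frame built from the rows visible in the question.
# Patients C and D are elided there, so they are not reproduced here.
df = pd.DataFrame({
    "patient_id": ["A"] * 5 + ["B"] * 4 + ["E"] * 3,
    "note_id":    [10, 10, 10, 29, 29, 12, 95, 95, 95, 14, 55, 87],
    "lines":      [1, 2, 3, 1, 2, 1, 1, 2, 3, 1, 1, 1],
})

def sample_one_note_per_patient(df, n_patients=3, seed=None):
    """Hypothetical helper: one random note for each of n_patients random patients."""
    # One row per (patient, note), so a note's chance of being picked
    # does not depend on how many lines it has.
    notes = df[["patient_id", "note_id"]].drop_duplicates()

    # Pick n_patients distinct patients at random.
    patients = (
        notes["patient_id"].drop_duplicates().sample(n=n_patients, random_state=seed)
    )

    # For each picked patient, pick exactly one of their notes.
    chosen = (
        notes[notes["patient_id"].isin(patients)]
        .groupby("patient_id")
        .sample(n=1, random_state=seed)
    )

    # Inner merge keeps every line belonging to the chosen notes.
    return df.merge(chosen, on=["patient_id", "note_id"])

result = sample_one_note_per_patient(df, n_patients=3, seed=0)
print(result)
```

Since the toy frame has only three patients, all of them are selected here; on the full 20-patient frame the same call would pick 3 of the 20. Sampling on the de-duplicated pairs instead of on the raw rows is a deliberate choice: sampling rows directly would favor notes with more lines.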


