I have a data frame as shown below.
ID
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
It has a single column, ID, with 20 unique values. I want to randomly pick 25% of the unique ID values (5 of them) and create a new column, OWNER_ID, by randomly assigning those picked values across the 20 rows, leaving 10% of the rows (2 rows) missing.
For rows whose ID is one of the randomly picked values, OWNER_ID should equal ID.
For example, suppose I randomly picked 2, 3, 8, 9 and 11.
The expected output:
ID OWNER_ID
1 2
2 2
3 3
4 11
5 9
6 11
7 11
8 8
9 9
10 2
11 11
12 2
13 na
14 8
15 9
16 8
17 9
18 2
19 2
20 na
I just don't know how to start on this, so I have not tried anything yet. I am just learning random data generation with pandas.
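To make the requirement concrete, here is a minimal sketch of one way it could be done with pandas and NumPy. It is not a definitive solution. The seed value, the variable names (`owners`, `is_owner`, `missing_idx`) and the choice to place the missing values only on rows whose ID was not picked are all my own assumptions, made so that picked IDs always keep a matching OWNER_ID.

```python
import numpy as np
import pandas as pd

# Seeded generator so the run is repeatable (the seed value is arbitrary).
rng = np.random.default_rng(42)

# The single-column frame from the question: ID = 1..20.
df = pd.DataFrame({"ID": range(1, 21)})

n_rows = len(df)
n_owners = round(n_rows * 0.25)   # 25% of the unique IDs
n_missing = round(n_rows * 0.10)  # 10% of the rows left missing

# Pick 25% of the unique IDs, without replacement.
owners = rng.choice(df["ID"].unique(), size=n_owners, replace=False)

# Give every row a randomly chosen owner (with replacement).
df["OWNER_ID"] = rng.choice(owners, size=n_rows)

# Rows whose ID was picked must point at themselves.
is_owner = df["ID"].isin(owners)
df.loc[is_owner, "OWNER_ID"] = df.loc[is_owner, "ID"]

# Assumption: blank out 10% of the rows, chosen only among non-picked IDs,
# so the ID/OWNER_ID match for picked IDs is never destroyed.
missing_idx = rng.choice(df.index[~is_owner].to_numpy(), size=n_missing, replace=False)

# Nullable integer dtype keeps the column integer-valued despite missing entries.
df["OWNER_ID"] = df["OWNER_ID"].astype("Int64")
df.loc[missing_idx, "OWNER_ID"] = pd.NA

print(df)
```

Note that this sketch does not guarantee every picked ID is used as an owner on some other row; it only guarantees that each picked ID owns itself. If the missing rows may also fall on picked IDs, `df.sample(n=n_missing).index` could be used for `missing_idx` instead.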