I have a data frame as shown below. It contains data on people who live in an area.
ID Nationality Age
1 India 38
2 China 45
3 USA 78
4 China 12
5 Pakistan 48
6 India 10
7 India 71
8 India 16
9 China 36
10 China 31
11 USA 82
12 Pakistan 3
13 Pakistan 36
14 India 26
15 USA 52
16 China 26
17 China 5
18 USA 4
19 Pakistan 24
20 Pakistan 85
To the above data frame I would like to add one more column, 'Owner_ID'.
Conditions:
1. Pick a random 20% of the IDs whose Nationality is India or Pakistan and whose Age satisfies 20 < Age < 70, and tag them as No_Owner (here ID = 14, with Nationality = India and Age = 26, is tagged as No_Owner; similarly ID = 19).
2. Owner_ID should be one of the IDs (1-20), and the owner's Nationality should match the Nationality of the row it is assigned to.
3. The Age of the owner should be between 30 and 50 for countries other than USA.
4. If the Nationality is USA, the owner's Age can be 25 to 85.
5. The percentage of Owner_IDs from USA should be more than 20%.
The Expected Output:
ID Nationality Age Owner_ID
1 India 38 1
2 China 45 2
3 USA 78 15
4 China 12 9
5 Pakistan 48 5
6 India 10 1
7 India 71 1
8 India 16 1
9 China 36 2
10 China 31 2
11 USA 82 11
12 Pakistan 3 5
13 Pakistan 36 5
14 India 26 No_Owner
15 USA 52 15
16 China 26 9
17 China 5 9
18 USA 4 15
19 Pakistan 24 No_Owner
20 Pakistan 85 5
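One possible way to approach this is sketched below. It assumes pandas and NumPy (the question does not name a library), and the names `assign_owners`, `frac` and `seed` are hypothetical. It also assumes that the age bounds are inclusive, that the 20% is taken over the rows eligible under condition 1 (the expected output tags two rows, so a different base may be intended), and that a row tagged No_Owner can still act as an owner for other rows. Condition 5 is only measured here, not enforced.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID": range(1, 21),
    "Nationality": ["India", "China", "USA", "China", "Pakistan",
                    "India", "India", "India", "China", "China",
                    "USA", "Pakistan", "Pakistan", "India", "USA",
                    "China", "China", "USA", "Pakistan", "Pakistan"],
    "Age": [38, 45, 78, 12, 48, 10, 71, 16, 36, 31,
            82, 3, 36, 26, 52, 26, 5, 4, 24, 85],
})


def assign_owners(df, frac=0.2, seed=0):
    """Return a copy of df with an Owner_ID column (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    out = df.copy()

    # Conditions 3 and 4: who may act as an owner (bounds assumed inclusive).
    is_usa = out["Nationality"].eq("USA")
    can_own = np.where(is_usa,
                       out["Age"].between(25, 85),
                       out["Age"].between(30, 50))
    # Map each nationality to the list of IDs that may own within it.
    pools = out.loc[can_own].groupby("Nationality")["ID"].apply(list).to_dict()

    # Condition 2: draw an owner of the same nationality for every row.
    # None is used if a nationality has no valid owner at all.
    owners = [int(rng.choice(pools[nat])) if nat in pools else None
              for nat in out["Nationality"]]
    out["Owner_ID"] = pd.Series(owners, index=out.index, dtype="object")

    # Condition 1: tag a random fraction of the eligible rows as No_Owner.
    eligible = out.index[out["Nationality"].isin(["India", "Pakistan"])
                         & out["Age"].gt(20) & out["Age"].lt(70)]
    n_pick = max(1, round(frac * len(eligible))) if len(eligible) else 0
    if n_pick:
        picked = rng.choice(eligible, size=n_pick, replace=False)
        out.loc[picked, "Owner_ID"] = "No_Owner"
    return out


result = assign_owners(df)

# Condition 5 (measured only): share of USA rows among rows with an owner.
assigned = result[result["Owner_ID"] != "No_Owner"]
usa_share = assigned["Nationality"].eq("USA").mean()
print(result)
print(f"USA share among assigned rows: {usa_share:.1%}")
```

Because owners are drawn at random, the result will not reproduce the expected output row for row; changing `seed` gives a different valid assignment. If condition 5 is meant to be a hard constraint under some other definition of "percentage", it would need an extra check or resampling step on top of this.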