Context
I have a dataframe where I need to remap a column to different values. For some values the mapping is ambiguous: the resulting value should be chosen randomly from a list every time the value to be mapped is encountered.
For example, the values in the column should be remapped in the following way:
- 1 ➝ 'a'
- 2 ➝ 'b' or 'c', chosen at random
- 3 ➝ 'd'
If there are two rows with a 2, a random draw should be done for each row to determine whether the value is mapped to 'b' or to 'c'.
Example data
Here is some example data:
import pandas as pd
df = pd.DataFrame({"col1": [1, 2, 3, 4, 5, 6, 7, 8], "col2": [2, 2, 2, 3, 1, 2, 2, 1]})
What I've looked into
I've tried using map and a random.choice call with a mapping dictionary (as described in this answer):
import random
choice_list = ["b", "c"]
# random.choice is called here, when the dictionary is created
map_dict = {1: "a", 2: random.choice(choice_list), 3: "d"}
df["remap"] = df.col2.map(map_dict)
I found that when remapping the value 2, a single value was chosen from choice_list and applied to all rows, e.g. all 'b':
   col1  col2 remap
0     1     2     b
1     2     2     b
2     3     2     b
3     4     3     d
4     5     1     a
5     6     2     b
6     7     2     b
7     8     1     a
Something similar happens when I use the replace method.
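A short sketch of why this probably happens, reusing the names from the map attempt above: Python evaluates random.choice(choice_list) a single time while the dictionary literal is being built, so the dictionary only ever holds one fixed string for the key 2. Nothing in it tells map or replace to draw again for each row.

```python
import random

choice_list = ["b", "c"]

# random.choice runs once, right here, while the dict literal is evaluated.
# The dict then stores a plain string, not an instruction to draw again.
map_dict = {1: "a", 2: random.choice(choice_list), 3: "d"}

# The stored value is an ordinary str taken from choice_list ...
print(type(map_dict[2]).__name__)
# ... and every later lookup returns that same stored string.
print(map_dict[2] == map_dict[2])
```

Any method that looks values up in map_dict will therefore see the same result for every row containing a 2.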
My expected outcome would be something like:
   col1  col2 remap
0     1     2     b
1     2     2     c
2     3     2     b
3     4     3     d
4     5     1     a
5     6     2     b
6     7     2     c
7     8     1     a
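One possible way to get an outcome like this, offered as a minimal sketch and not the only approach: pass a function to map, so the random draw happens once per row. The names choices and remap_value are hypothetical and chosen only for this illustration; storing every target as a list of candidates is likewise an assumption made to keep the lookup uniform.

```python
import random

import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3, 4, 5, 6, 7, 8], "col2": [2, 2, 2, 3, 1, 2, 2, 1]})

# Hypothetical layout: every key maps to a list of candidate values,
# so unambiguous mappings are simply lists with one element.
choices = {1: ["a"], 2: ["b", "c"], 3: ["d"]}


def remap_value(value):
    # Series.map calls this once per element, so each row gets its own draw.
    return random.choice(choices[value])


df["remap"] = df["col2"].map(remap_value)
print(df)
```

Because the draw is random, the exact pattern of 'b' and 'c' will differ between runs unless a seed is set with random.seed. Note that this sketch raises a KeyError for any value of col2 that is missing from choices.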