Tuesday, 7 March 2023

How to generate a pandas dataframe column with pre-determined values

I have created a pandas dataframe using this code:

import pandas as pd
import numpy as np

ds = {'col1':[1,1,1,1,1,0,0,0]}

df = pd.DataFrame(data=ds)

The dataframe looks like this:

print(df)
   col1
0     1
1     1
2     1
3     1
4     1
5     0
6     0
7     0

I need to create a new column, called col2, subject to these conditions:

  1. When col1 = 1, there are 5 records. For 3 of those records, col2 must equal 2, and for the remaining 2 records col2 must equal 3. The positions of the 2s and 3s are random.

  2. When col1 = 0, there are 3 records. For 2 of those records, col2 must equal 5, and for the remaining record col2 must equal 6. The positions of the 5s and the 6 are random.

The resulting dataframe would look as follows (the positions of the values in col2 are random, so your result may place them in different rows, but the count of each value in col2 must meet the conditions above):

[Image: example of the resulting dataframe with columns col1 and col2]

Does anyone know how to do this in Python?
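
One possible approach, as a minimal sketch rather than a definitive answer: for each value of col1, build the exact list of col2 values required, shuffle it, and assign it to the matching rows. The counts dictionary and the rng variable below are names made up for this illustration; they are not part of the original code.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 1, 1, 1, 1, 0, 0, 0]})

# Hypothetical mapping (an assumption for this sketch):
# col1 value -> {col2 value: how many records should receive it}
counts = {
    1: {2: 3, 3: 2},
    0: {5: 2, 6: 1},
}

# Pass a seed, e.g. default_rng(42), if reproducible output is needed
rng = np.random.default_rng()

df['col2'] = 0  # placeholder, overwritten below
for key, spec in counts.items():
    mask = df['col1'] == key
    # Expand {value: count} into a flat array, e.g. [2, 2, 2, 3, 3]
    values = np.repeat(list(spec.keys()), list(spec.values()))
    if mask.sum() != len(values):
        raise ValueError(f"counts for col1 == {key} do not match the number of rows")
    # Shuffle the values and write them into the rows of this group
    df.loc[mask, 'col2'] = rng.permutation(values)

print(df)
```

Because the values are shuffled rather than sampled with probabilities, each group gets exactly the requested number of each value; only the row positions vary between runs.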



