I need a `DataFrame` with `r` rows and a dynamic number of columns (based on the groups). The input `count` column specifies how many `True` values are expected for each group in the new `DataFrame`. My current implementation creates a temporary `DataFrame` with a single row containing a `True` value for each group in `df`, and then `explode()`s that temporary DataFrame. Finally, it groups by `count` and aggregates into the result `df`.
input
--
| group | count | ... |
|-------|-------|-----|
| A | 2 | |
| B | 0 | |
| C | 4 | |
| D | 1 | |
I need to fill the new `DataFrame` with these values at random positions (the number of columns `c` is dynamic, and so are the column names).
expected output
--
A | B | C | D |
---|---|---|---|
NaN | NaN | True | True |
True | NaN | True | NaN |
NaN | NaN | NaN | NaN |
NaN | NaN | True | NaN |
True | NaN | True | NaN |
I think it should be possible to add a randomized set of row numbers drawn from 1 to `r` and, after expanding, just `agg(sum)` by these values.
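The idea above (draw random row numbers per group, expand, then reshape) could be sketched roughly as follows. This is only a minimal sketch under some assumptions: `r = 5` is taken from the expected output, the names `rng`, `rows`, `long` and `result` are made up for illustration, and the seed is fixed only so the sketch is repeatable. It swaps the final `agg(sum)` for a `pivot`, since each (row, group) pair occurs at most once.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"group": ["A", "B", "C", "D"], "count": [2, 0, 4, 1]})
r = 5  # assumed number of rows in the result
rng = np.random.default_rng(0)  # fixed seed only for repeatability

# For every group, draw `count` distinct row numbers from 0..r-1.
# Note: this raises a ValueError if a count is larger than r.
rows = df["count"].apply(lambda n: rng.choice(r, size=n, replace=False).tolist())

# Long form: one line per (row, group) pair that should hold True.
# Groups with count 0 explode to NaN and are dropped here.
long = (
    df.assign(row=rows)
    .explode("row")
    .dropna(subset=["row"])
    .assign(row=lambda d: d["row"].astype(int), value=True)
)

# Wide form: rows 0..r-1, one column per group, NaN where nothing was drawn.
result = (
    long.pivot(index="row", columns="group", values="value")
    .reindex(index=range(r), columns=df["group"])
)
print(result)
```

The `reindex` step is what brings back rows that received no value and groups such as `B` whose count is 0.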
my code
--
import pandas as pd

inputs = [
    {"group": "A", "count": 2},
    {"group": "B", "count": 0},
    {"group": "C", "count": 4},
    {"group": "D", "count": 1},
]
df = pd.DataFrame(inputs)


def expand(count: int, group: str) -> pd.DataFrame:
    """Expand the DF by counts."""
    count = int(round(count))
    df1 = pd.DataFrame([{group: True}])
    # I think this is where I need to add a random seed
    df1 = (
        df1.assign(count=[list(range(1, count + 1))])
        .explode('count')
        .reset_index(drop=True)
    )
    return df1


def creator(df: pd.DataFrame) -> pd.DataFrame:
    """Create a new DF for every count value of the group."""
    dfs = [expand(r, df['group'].values[0]) for r in df['count'].values]
    return pd.concat(dfs, ignore_index=True)


# Parentheses instead of backslashes, so the comment inside the chain is valid.
result = (
    df.groupby('group', as_index=False)
    .apply(creator)
    .drop('count', axis=1)
    # and group by my seed
    .groupby(level=1)
    .agg(sum)
)
I tried to spell out my questions, in case that is helpful:
- Is there any method in pandas to make this easier/better?
- How can I generate random counts and assign them in the `expand()` function?
- Is there a way to create a `DataFrame` of a given size filled with `NaN` and then just drop my values into it at random positions (with something like `pd.where`)?
PS: This is my first time asking a question, so I hope I have provided enough information!
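For the last question, one possible approach is to build a boolean mask of the target shape and let `DataFrame.where` turn every `False` into `NaN`. The following is a hedged sketch, not a definitive answer: `r = 5` is assumed from the expected output, the `counts` dict simply restates the input table, and `rng`, `mask` and `out` are illustrative names.

```python
import numpy as np
import pandas as pd

counts = {"A": 2, "B": 0, "C": 4, "D": 1}  # group -> number of True values
r = 5  # assumed number of rows
rng = np.random.default_rng(0)  # fixed seed only for repeatability

# Boolean mask of shape (r, number of groups), all False to begin with.
mask = np.zeros((r, len(counts)), dtype=bool)
for col, n in enumerate(counts.values()):
    # The first n entries of a random permutation are n distinct rows.
    # Caveat: if n > r, this silently marks only r rows.
    mask[rng.permutation(r)[:n], col] = True

# Keep True where the mask is set; everything else becomes NaN.
out = pd.DataFrame(mask, columns=list(counts)).where(mask)
print(out)
```

Because the randomness lives entirely in the NumPy mask, no `explode()` or `groupby` is needed in this variant.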