Wednesday, March 24, 2021

How to save optimal relative standard deviation data after n-iterations?

I have a table after randomization:

parameter1 parameter2 random
Q1 12 A3
L3 15 A4
K3 13 A1
O1 14 A2
N2 12 A1
L33 19 A3
O7 11 A4
E3 16 A2

I would like to calculate the count, mean, median, standard deviation, and relative standard deviation of parameter2, grouped by the random column, over n iterations. I then want to save the data from every iteration and pick the table with the optimal relative standard deviation. How do I do that?

What I have tried so far:

import pandas as pd

def randomization(df):
    # do randomization (elided here); should return the randomized DataFrame
    ...

data = [randomization(df) for _ in range(10)]  # assuming 10 iterations

def statistical_analysis(table):
    # Per-group summary statistics of parameter2, one row per value of 'random'
    stat_table = table.groupby('random')['parameter2'].agg(
        N='count', mean='mean', median='median', std='std'
    )
    # Relative standard deviation (coefficient of variation) per group
    stat_table['relative std'] = stat_table['std'] / stat_table['mean']
    return stat_table

# One statistics table per iteration
temp_list = [statistical_analysis(table) for table in data]

How do I automatically select the table with the optimal relative standard deviation from the n iterations?
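
One possible approach, sketched under explicit assumptions: reduce each statistics table to a single score, keep every iteration, and take the iteration with the lowest score. The question does not define "optimal", so the sketch assumes it means the smallest worst-case (maximum) relative standard deviation across groups. The `score` helper, the seeded `rng`, and the label-shuffling body of `randomization` are hypothetical stand-ins, not the original randomization step.

```python
import numpy as np
import pandas as pd


def randomization(df, rng):
    # Hypothetical stand-in for the elided randomization step:
    # shuffle the group labels in the 'random' column.
    out = df.copy()
    out["random"] = rng.permutation(out["random"].to_numpy())
    return out


def statistical_analysis(table):
    # Per-group count, mean, median, std and relative std of parameter2.
    stats = table.groupby("random")["parameter2"].agg(
        N="count", mean="mean", median="median", std="std"
    )
    stats["relative std"] = stats["std"] / stats["mean"]
    return stats


def score(stats):
    # Assumed criterion: the largest relative std across the groups.
    return stats["relative std"].max()


df = pd.DataFrame({
    "parameter1": ["Q1", "L3", "K3", "O1", "N2", "L33", "O7", "E3"],
    "parameter2": [12, 15, 13, 14, 12, 19, 11, 16],
    "random": ["A3", "A4", "A1", "A2", "A1", "A3", "A4", "A2"],
})

rng = np.random.default_rng(0)  # seeded so the run is reproducible
tables = [randomization(df, rng) for _ in range(10)]
stat_tables = [statistical_analysis(t) for t in tables]

# Keep every iteration's statistics in one frame, keyed by iteration number.
all_stats = pd.concat(stat_tables, keys=list(range(len(stat_tables))))

# Pick the iteration whose score is lowest.
scores = [score(s) for s in stat_tables]
best = int(np.argmin(scores))
best_table, best_stats = tables[best], stat_tables[best]
```

If "optimal" should mean something else, such as the lowest mean relative standard deviation, only `score` needs to change. `all_stats` can be written out with `all_stats.to_csv(...)` to save every iteration.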



