Monday, 16 April 2018

How to compare mean and sd of a randomly generated data frame using pandas

I am attaching a screenshot of an Excel file. It contains x and y in two columns. I found a way to count the functional loops present, namely (x>x)=a, (x>y, y>x)=b, (x>y, x>y1)=c, (x>y, y>x1)=d. I generated 1000 random networks from the unique x and y values in the dataframe num_nodes, storing each one in a new dataframe (random_generated_network). What I want is to compare the mean and standard deviation of the functional-loop counts new_a, new_b, new_c, new_d across the random networks with the observed a, b, c, d. Any help will be appreciated.

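import pandas as pd
import numpy as np

df = pd.read_csv('/home/amit/Desktop/playing_with_pandas.csv')

# Counts of each loop type in the observed network
a = (df.x == df.y).sum()                      # rows where x equals y
b = df.duplicated().sum()                     # repeated (x, y) rows
c = (df.y.groupby(df.x).nunique() > 1).sum()  # x values paired with more than one distinct y
d = (df.x.groupby(df.y).nunique() > 1).sum()  # y values paired with more than one distinct x

# One row per unique x value (the last occurrence is kept)
num_nodes = df.drop_duplicates(subset='x', keep="last")

def simulate_df(num_nodes, size_of_simulated_df):
    # Draw x and y independently, with replacement, from the columns of num_nodes
    return pd.DataFrame({'x': np.random.choice(num_nodes.x, size_of_simulated_df),
                         'y': np.random.choice(num_nodes.y, size_of_simulated_df)})

random_generated_network = simulate_df(num_nodes, 1000)

# Generate 1000 random networks, each with as many rows as num_nodes
dict_of_dfs = {}
for i in range(1000):
    random_generated_network = dict_of_dfs['df'+str(i)] = simulate_df(num_nodes, len(num_nodes))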

# Note: these counts cover only the last network generated in the loop above
new_a = (random_generated_network.x == random_generated_network.y).sum()
new_b = random_generated_network.duplicated().sum()
new_c = (random_generated_network.y.groupby(random_generated_network.x).nunique() > 1).sum()
new_d = (random_generated_network.x.groupby(random_generated_network.y).nunique() > 1).sum()
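
One possible way to get the comparison I am after is sketched below. It counts the four loop types in every simulated network instead of only the last one, then puts the mean and standard deviation of those counts next to the observed values. The names count_loops, compare_to_random, n_networks and seed are hypothetical helpers introduced only for this sketch. The z-score column is just one assumed way of expressing the comparison, and the sd is the sample standard deviation (pandas default, ddof=1). As in the code above, the observed counts come from the full df while each random network has len(num_nodes) rows, which may or may not be the right null model.

```python
import numpy as np
import pandas as pd


def count_loops(net):
    """Return the four counts (a, b, c, d) for one edge list with columns x and y."""
    return pd.Series({
        'a': (net.x == net.y).sum(),
        'b': net.duplicated().sum(),
        'c': (net.y.groupby(net.x).nunique() > 1).sum(),
        'd': (net.x.groupby(net.y).nunique() > 1).sum(),
    })


def compare_to_random(df, n_networks=1000, seed=0):
    """Summarise loop counts over random networks next to the observed counts."""
    rng = np.random.default_rng(seed)  # seed is an assumption, added for reproducibility
    nodes = df.drop_duplicates(subset='x', keep='last')
    rows = []
    for _ in range(n_networks):
        # Same sampling scheme as simulate_df: x and y drawn independently with replacement
        net = pd.DataFrame({'x': rng.choice(nodes.x.to_numpy(), len(nodes)),
                            'y': rng.choice(nodes.y.to_numpy(), len(nodes))})
        rows.append(count_loops(net))
    sims = pd.DataFrame(rows)  # one row per random network, columns a, b, c, d
    summary = pd.DataFrame({'observed': count_loops(df),
                            'random_mean': sims.mean(),
                            'random_sd': sims.std()})
    # Distance of the observed count from the random mean, in units of the random sd
    summary['z_score'] = (summary.observed - summary.random_mean) / summary.random_sd
    return summary
```

With df loaded as above, summary = compare_to_random(df) would give one row per loop type. If a loop type has the same count in every random network, its random_sd is 0 and the z-score is not finite, so that case would need separate handling.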

[Screenshot: playing_with_pandas]



