Thursday, July 28, 2016

How to Prevent Triplicates in Pandas DataFrame

I have the following code:

stim_df = pd.concat([block1,block2,block3,block4], axis=0, ignore_index=True).sample(frac=1).reset_index(drop=True)
stim_df.columns = ["Prime","Target","Condition"]

# Check for triplicates: rows j-2, j-1 and j sharing the same Condition
for j in xrange(2, len(stim_df)):
    if stim_df["Condition"][j] == stim_df["Condition"][j-1] == stim_df["Condition"][j-2]:
        # Attempt to break the run by reordering the surrounding rows
        stim_df[j-2:j+3] = stim_df[j-2:j+3].reindex([j-2,j-1,j+2,j,j+1])

What I'm trying to prevent is three adjacent rows having the same "Condition" value. So if my conditions are "1", "2", and "3", I want to prevent an order like 1,1,2,2,2,1,3,1 from occurring, where the condition value 2 appears three times in a row.
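
To make the constraint concrete, here is a minimal sketch of a check for it. The helper name `longest_run` is hypothetical (it is not part of pandas), and it works on any plain sequence of condition values, for example `stim_df["Condition"].tolist()`:

```python
from itertools import groupby

def longest_run(values):
    """Return the length of the longest run of equal adjacent values."""
    longest = 0
    # groupby yields one group per stretch of consecutive equal values
    for _, group in groupby(values):
        longest = max(longest, sum(1 for _ in group))
    return longest

order = [1, 1, 2, 2, 2, 1, 3, 1]
print(longest_run(order))  # 3, because condition 2 appears three times in a row
```

An order is acceptable when `longest_run(...)` is 2 or less.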

My code doesn't solve the issue: triplicates can still appear after the loop has run. Would it be better to write a pseudo-randomization function instead of trying to repair the order after I've already shuffled the DataFrame? Any assistance or suggestions would really help.
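
For reference, one way such a pseudo-randomization function might look is a simple reshuffle-until-valid loop (rejection sampling). This is only a sketch under assumptions: the name `shuffled_positions` and its parameters `max_run`, `seed`, and `max_tries` are hypothetical, and the approach can be slow when valid orders are rare (for instance, when one condition makes up most of the rows).

```python
import random

def shuffled_positions(labels, max_run=2, seed=None, max_tries=10000):
    """Return a random ordering of row positions in which no label
    occurs more than `max_run` times in a row."""
    rng = random.Random(seed)
    positions = list(range(len(labels)))
    for _ in range(max_tries):
        rng.shuffle(positions)
        run = 0
        previous = object()  # sentinel that compares unequal to every label
        for pos in positions:
            run = run + 1 if labels[pos] == previous else 1
            if run > max_run:
                break  # this shuffle contains a run that is too long
            previous = labels[pos]
        else:
            # The inner loop finished without a break: the order is valid
            return list(positions)
    raise ValueError("no valid order found in %d tries" % max_tries)

conditions = [1, 1, 1, 2, 2, 2, 3, 3, 3]
order = shuffled_positions(conditions, seed=0)
print([conditions[i] for i in order])

# Possible use with the DataFrame above (pass a plain list, not the Series):
# order = shuffled_positions(stim_df["Condition"].tolist())
# stim_df = stim_df.iloc[order].reset_index(drop=True)
```

Because every shuffle is either accepted or thrown away whole, each valid order is equally likely, which is not guaranteed when rows are swapped after the fact.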



