Thursday, May 3, 2018

I get duplicates when using np.random.choice

I get duplicate rows when I use this code. What am I doing wrong with the random sampling?

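# Keep only rows with VN >= 1000
data = data[data["VN"] >= 1000]
# Split by class
data_T1 = data[data["TARGET"] == 1]
data_T0 = data[data["TARGET"] == 0]
# Draw 10000 rows from class 0 and combine them with class 1
data_T0_random = data_T0.loc[np.random.choice(data_T0.index, 10000)]
data = data_T1.append(data_T0_random)
# Compare row counts before and after dropping duplicates
print('q:', len(data.index))
rr = data.drop_duplicates()
print('qq:', len(rr.index))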
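A likely cause: np.random.choice samples with replacement by default (replace=True), so the same index label can be drawn more than once and .loc then returns that row several times. Below is a minimal sketch of sampling without replacement. The small `data` frame, the seed, and the sample size `n` are made up for illustration; only the column names VN and TARGET come from the question. It uses pd.concat because DataFrame.append was removed in pandas 2.0.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real `data`; only the column names are from the question.
rng = np.random.default_rng(0)
data = pd.DataFrame({
    "VN": rng.integers(1000, 5000, size=500),
    "TARGET": rng.integers(0, 2, size=500),
})

data_T1 = data[data["TARGET"] == 1]
data_T0 = data[data["TARGET"] == 0]

# Without replacement you cannot draw more rows than exist, so cap the size.
n = min(100, len(data_T0))

# replace=False makes each index label appear at most once in the draw.
picked = np.random.choice(data_T0.index, size=n, replace=False)
data_T0_random = data_T0.loc[picked]

# pandas alternative: DataFrame.sample draws without replacement by default.
data_T0_sample = data_T0.sample(n=n, random_state=0)

balanced = pd.concat([data_T1, data_T0_random])
print(balanced.index.is_unique)
```

Note that drop_duplicates compares column values, not index labels, so rows that are genuinely identical in the source data will still be counted as duplicates even when every row is drawn only once.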



