Why does seeding with numpy.random.seed():
np.random.seed(40)
not always guarantee the same train/test split with np.random.rand?
msk = np.random.rand(len(df)) < 0.8
train = df[msk]
test = df[~msk]
First try (train):
0 12,886,167
1 12,777,434
2 14,054,459
3 14,520,707
4 12,618,535
...
Second try (train):
0 12,886,167
1 12,777,434
2 14,054,459
3 14,520,707
5 8,489,784
...
How can I get the same np.random.rand data separation every time?
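A minimal sketch of one way to make the split repeatable, assuming the differences come from other random draws happening between np.random.seed(40) and np.random.rand (for example, when only part of a script or notebook is re-run). The helper name split_frame and the stand-in DataFrame are hypothetical and not from the original code; the idea is simply to create a freshly seeded generator right where the mask is drawn, so the global random state no longer matters.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the `df` in the question.
df = pd.DataFrame({"value": np.arange(1000)})


def split_frame(frame, seed=40, train_frac=0.8):
    """Split `frame` into train/test using a locally seeded generator."""
    # A new RandomState per call: draws made elsewhere cannot shift the sequence.
    rng = np.random.RandomState(seed)
    msk = rng.rand(len(frame)) < train_frac
    return frame[msk], frame[~msk]


# Two independent calls, with unrelated random draws in between.
train_a, test_a = split_frame(df)
np.random.rand(5)
train_b, test_b = split_frame(df)

print(train_a.index.equals(train_b.index))
```

This only helps when df itself is identical between runs: if its length or row order changes, the same mask will select different rows.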