Saturday, 26 September 2020

Why does random.seed() not work when generating a dataset?

I'm creating a dataset for testing with:

import random
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

random.seed(10)
X, y = make_regression(n_samples = 1000, n_features = 10)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 0)
X[0:2]

Could you please explain why I get a different dataset on each run? For example, two runs return

array([[-0.28058959, -0.00570283,  0.31728106,  0.52745066,  1.69651572,
        -0.37038286,  0.67825801, -0.71782482, -0.29886242,  0.07891646],
       [ 0.73872413, -0.27472164, -1.70298606, -0.59211593,  0.04060707,
         1.39661574, -1.25656819, -0.79698442, -0.38533316,  0.65484856]])

and

array([[ 0.12493586,  1.01388974,  1.2390685 , -0.13797227,  0.60029193,
        -1.39268898, -0.49804303,  1.31267837,  0.11774784,  0.56224193],
       [ 0.47067323,  0.3845262 ,  1.22959284, -0.02913909, -1.56481745,
        -1.56479078,  2.04082295, -0.22561445, -0.37150552,  0.91750366]])
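
A likely explanation, offered as a sketch rather than a definitive answer: random.seed(10) seeds Python's built-in random module, whereas make_regression draws its samples from NumPy's random number generator, so the seed never reaches it. Assuming that is the cause here, passing random_state to make_regression (the same parameter already used for train_test_split above) should make the dataset reproducible. The value 10 below is only an illustrative choice.

```python
import numpy as np
from sklearn.datasets import make_regression

# random_state fixes the NumPy-based generator that make_regression uses;
# seeding Python's built-in `random` module has no effect on it.
X1, y1 = make_regression(n_samples=1000, n_features=10, random_state=10)
X2, y2 = make_regression(n_samples=1000, n_features=10, random_state=10)

# Compare the two independently generated datasets element by element.
same_data = np.array_equal(X1, X2) and np.array_equal(y1, y2)
print(same_data)
```

Calling np.random.seed(10) before make_regression is another option, since the function falls back to NumPy's global generator when random_state is left as None, but an explicit random_state keeps the seeding local to the call.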


