Saturday, 22 February 2020

mutual_info_classif in sklearn: confused about how random_state affects feature selection

I used mutual_info_classif and SelectPercentile from sklearn to do feature selection on a dataset. I found that I can set random_state to 0 so that the same features are selected every time, as in the code below:

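from functools import partial
from sklearn.feature_selection import SelectPercentile, mutual_info_classif
mi = mutual_info_classif(X_train, y_train, random_state=0)
print(mi)
# Fix the seed inside the selector too; a bare mutual_info_classif would use random_state=None here
sel_mi = SelectPercentile(partial(mutual_info_classif, random_state=0), percentile=10).fit(X_train, y_train)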

Alternatively, I can leave random_state at its default (None), but then the selection can be different on every run:

mi = mutual_info_classif(X_train, y_train)
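To make the difference between the two calls concrete, here is a minimal sketch. It assumes a synthetic dataset from make_classification as a stand-in for X_train and y_train, and the names X_demo, y_demo, and the rounding step are only for illustration (rounding creates tied values, which is where the seed is most likely to matter). It checks that a fixed seed reproduces the scores and prints how far two different seeds drift apart; the size of that gap depends on the data and is not guaranteed to be non-zero.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

# Hypothetical stand-in for X_train / y_train (not the original dataset).
X_demo, y_demo = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=42
)
# Round the features so that many values are tied (an assumption made for this demo).
X_demo = np.round(X_demo)

# Same seed twice, then a different seed.
mi_seed0 = mutual_info_classif(X_demo, y_demo, random_state=0)
mi_seed0_again = mutual_info_classif(X_demo, y_demo, random_state=0)
mi_seed1 = mutual_info_classif(X_demo, y_demo, random_state=1)

same_seed_matches = np.array_equal(mi_seed0, mi_seed0_again)
max_seed_gap = float(np.max(np.abs(mi_seed0 - mi_seed1)))

print("same seed reproduces scores:", same_seed_matches)
print("largest score gap between seed 0 and seed 1:", max_seed_gap)
```

If the gap is small compared with the differences between feature scores, the ranking of features should be mostly unaffected by the seed.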

If the selection is the same every time, how can I judge whether the selected features are the best choice?

And if the selection is different every time, does that mean this kind of feature selection is meaningless?
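One way to look at both questions is to measure how stable the selection is across seeds. This is only a sketch under assumptions: the data again comes from make_classification rather than the original dataset, and the names X_demo, y_demo, n_seeds, and selection_counts are made up for illustration. It repeats the SelectPercentile step with several seeds and counts how often each feature is kept.

```python
from functools import partial

import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectPercentile, mutual_info_classif

# Hypothetical stand-in for X_train / y_train (not the original dataset).
X_demo, y_demo = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=42
)

n_seeds = 10  # number of repetitions, chosen arbitrarily for this sketch
selection_counts = np.zeros(X_demo.shape[1], dtype=int)

for seed in range(n_seeds):
    # partial() forwards the seed to the score function used by the selector.
    score_func = partial(mutual_info_classif, random_state=seed)
    selector = SelectPercentile(score_func, percentile=10).fit(X_demo, y_demo)
    # get_support() returns a boolean mask of the selected features.
    selection_counts += selector.get_support().astype(int)

# Report only the features that were selected at least once.
for feature_idx in np.flatnonzero(selection_counts):
    print(f"feature {feature_idx}: selected in {selection_counts[feature_idx]} of {n_seeds} runs")
```

Features that are kept in nearly every run are chosen regardless of the seed, while features that appear only occasionally sit close to the percentile cutoff. Whether the selected set is actually good is a separate question that would need something like cross-validated model performance to judge.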
