Saturday, January 30, 2021

Multi-class classification with Random Forest - how to measure the "stability" of results

I am using Random Forest (from sklearn) for a multi-class classification problem with ordered classes (say 0, ..., n, with n = 4 in my specific case) that are roughly equally distributed. I have many observations (roughly 5000) and I split them into train/test sets of 70%/30% respectively; the classes are equally distributed in both train and test as well. I set random_state=None, so each time I re-run the fitting of the model (on the same training set) and then the prediction, I obtain slightly different results on my test set.
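For reference, the setup described above can be sketched as follows; the data here is synthetic (`make_classification` stands in for my real dataset, and the split parameters mirror what I described):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for my data: ~5000 observations, 5 ordered classes
# (0..4), roughly equally distributed.
X, y = make_classification(
    n_samples=5000, n_classes=5, n_informative=10, random_state=0
)

# 70%/30% split, stratified so the classes stay equally distributed
# in both train and test.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# random_state=None: each call to fit() uses a fresh seed, so repeated
# fits on the same training set give slightly different predictions.
clf = RandomForestClassifier(random_state=None).fit(X_train, y_train)
pred = clf.predict(X_test)
```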

My question is how to measure whether the Random Forest is working well by comparing the different predictions across runs.

For example, if I obtain as predictions first only 0 and then only n (where, as said, 0 and n are the most distant classes), I would say the RF is not working at all. On the contrary, if only a few predictions change from a class to a close one (e.g. first 0 and then 1), I would say the RF is working well.

Is there a specific command to check this automatically?



