For a simulation, I would like to randomly generate 100 linear classifiers (i.e. lines). I was doing this with:
classifier_array=list(np.random.uniform(-1,1,(n_classifiers,n_dim+1)))
In other words, I am trying to implement w_1*x_1+w_2*x_2+...+b=0 by drawing the weights and the bias uniformly at random between -1 and 1.
However, although the slopes appear to be well distributed, the intercepts (in dimension 2) are not: they always seem to be close to 0.
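One way to check this impression numerically rather than by eye is sketched below. It assumes the same setup as above (100 classifiers in dimension 2) and uses a fixed seed and a plot half-width of 100, both chosen only for illustration. Since the intercept is -b/w_2, a ratio of two draws from [-1, 1], its typical magnitude is on the order of 1, which is small on axes that span [-100, 100].

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen only for reproducibility
n_classifiers, n_dim = 100, 2    # values taken from the setup described above

# Each row is (w_1, w_2, b), drawn uniformly from [-1, 1).
classifiers = rng.uniform(-1, 1, (n_classifiers, n_dim + 1))

# Rewrite w_1*x + w_2*y + b = 0 as y = slope*x + intercept.
slopes = -classifiers[:, 0] / classifiers[:, 1]
intercepts = -classifiers[:, 2] / classifiers[:, 1]

# Fraction of lines whose intercept is small relative to the plotted y-range.
plot_half_width = 100            # assumed to match the axis limits used for plotting
frac_small = np.mean(np.abs(intercepts) < 0.1 * plot_half_width)

print("median |slope|    :", np.median(np.abs(slopes)))
print("median |intercept|:", np.median(np.abs(intercepts)))
print("fraction with |intercept| < 10:", frac_small)
```

If most intercepts fall within a tenth of the axis range, the lines will all appear to pass near the origin at that zoom level, even though the slopes cover every direction.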
Code I used to plot:
import numpy as np
import matplotlib.pyplot as plt

def abline(slope, intercept):
    """Plot a line from slope and intercept"""
    axes = plt.gca()
    x_vals = np.array(axes.get_xlim())
    y_vals = intercept + slope * x_vals
    plt.plot(x_vals, y_vals, '--')

def plot_data(classifiers):
    axes = plt.gca()
    axes.set_xlim([-100, 100])
    axes.set_ylim([-100, 100])
    # Each classifier is (w_1, w_2, b); w_1*x + w_2*y + b = 0 gives y = slope*x + intercept
    for w1, w2, b in classifiers:
        slope = -w1 / w2
        intercept = -b / w2
        abline(slope, intercept)
    plt.show()
and then I simply call
plot_data(classifier_array)
after generating my classifiers.
1) Why are the biases so close to 0?
2) How can I get biases distributed over a greater range?
Actually, what I want to do is linearly separate my data by randomly generating linear "classifiers" and choosing the best one with a dedicated algorithm. But when my data is "shifted" toward the top right corner, all the classifiers with a negative slope are useless (see picture). One idea would be to center my data, which may well be the best solution, but I am first looking for a way to get larger intercepts.
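The two ideas mentioned above, centering the data and drawing the bias from a wider range, could be sketched as follows. This is only an illustration under stated assumptions: the data `X` is made up (200 points in the top right corner), and `bias_scale` is a hypothetical heuristic that ties the bias range to the extent of the data rather than a known-good choice.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen only for reproducibility
n_classifiers, n_dim = 100, 2

# Hypothetical data clustered in the top right corner (not from the question).
X = rng.uniform(60, 90, (200, n_dim))

# Option A: center the data, so that lines passing near the origin cross it.
X_centered = X - X.mean(axis=0)

# Option B: keep the data as is, but scale the bias range to the data.
# |w . x| is at most n_dim * max|x| when every weight lies in [-1, 1],
# so this range covers every bias for which the line can reach the data.
bias_scale = n_dim * np.abs(X).max()   # assumed heuristic
weights = rng.uniform(-1, 1, (n_classifiers, n_dim))
biases = rng.uniform(-bias_scale, bias_scale, (n_classifiers, 1))

# Same layout as before: each entry is (w_1, ..., w_n, b).
classifier_array = list(np.hstack([weights, biases]))
```

With option A the original sampling of weights and bias can stay unchanged, provided the same shift is applied to any new point before classifying it. Option B leaves the data alone but spreads the lines over a much larger area, so many of them may still miss the data.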