Thanks for looking at this question! I am trying to build a neural network for pattern recognition. My problem is that only about 1% of my data is 'good data' (i.e. data with a target value of 1); the other 99% is 'bad data' (target value of 0). Currently I sample the two classes 50/50 over n iterations. My average cost does converge, but only to about 19%. My questions are the following: is this a sensible way to sample a very unbalanced data set? And if so, given the way I sampled the data, how far from random is an average cost of about 19%?
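To make the sampling scheme concrete, here is a rough sketch of what I mean by 50/50 sampling. It is not my actual code: the function name `sample_balanced_batch`, the batch size, and the toy data are all made up for illustration, and it assumes NumPy arrays with labels in {0, 1}.

```python
import numpy as np

def sample_balanced_batch(X, y, batch_size, rng):
    """Draw a batch that is half positives (y == 1) and half negatives (y == 0)."""
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    half = batch_size // 2
    # Sample with replacement: the positive class is small, so rows get reused.
    pos = rng.choice(pos_idx, size=half, replace=True)
    neg = rng.choice(neg_idx, size=batch_size - half, replace=True)
    # Shuffle so positives and negatives are interleaved within the batch.
    idx = rng.permutation(np.concatenate([pos, neg]))
    return X[idx], y[idx]

rng = np.random.default_rng(0)

# Toy data (hypothetical): 10,000 rows, exactly 1% with target value 1.
X = rng.normal(size=(10_000, 5))
y = np.zeros(10_000, dtype=int)
y[:100] = 1

Xb, yb = sample_balanced_batch(X, y, batch_size=64, rng=rng)
print(Xb.shape, yb.mean())  # (64, 5) 0.5
```

On a batch like this, a classifier that always predicts the same class is wrong on exactly half the rows, so 50% is the misclassification rate I would expect from a trivial predictor. Whether my 19% can be compared to that directly depends on the cost function, which is part of what I am unsure about.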
If anything isn't clear please let me know!
Cheers!