I’m working on a project in ROOT involving an MLP, and I’d like to initialize the network with a large number of different sets of randomized weights, to reduce the chance that it gets stuck in a particular local minimum of the error. To this end, I’d like to construct a loop over, say, 1000 iterations. Each iteration would randomize the weights, train the network, and obtain some measurement of the error; if that error is smaller than whatever value my “error” variable currently holds, it would overwrite “error” with the new value and dump the weights that produced it to a file, overwriting whatever set of weights is already there. After a large number of iterations, I can then hopefully single out the set of weights that produced the best results.
The part I’m currently stuck on is which measurement of error to use, because I don’t quite understand what the different possible ways of measuring error on a TMLP actually mean. When I use the mlp->Train method with the “text” option, I get periodic updates on one measurement of the error for both the training and test datasets; in my case, both of these values end up around 0.22 after 50 epochs. When I use the mlp->GetError(Int_t event) method on the final event in the dataset (I’m not actually sure which dataset, since both have the same number of events and I can’t see any way to specify), it returns a value of approximately 0.03. And when I use the mlp->GetError(TMultiLayerPerceptron::EDataSet set) method on either kTraining or kTest, I get values around 250.
Because these three measurements of the error arrive at such different results, it’s evident that they are different quantities to begin with. I understand what the error on a single event means and why it would differ from the other two measurements. But I’m not sure what is being measured by mlp->Train, or how it differs from what mlp->GetError measures on an entire dataset. Can somebody explain this to me?