Normalization of MLP output

Hi rooters,

I have some questions about the normalization of the MLP output. I use 0 and 1 to represent background and signal respectively, following the instructions in mlpHiggs.C. The neural network training goes well, but the output can be slightly less than 0 or greater than 1. I want to constrain the output to [0,1] so that the result looks nicer, so I put a “@” in front of the output neuron in the constructor of the MLP and did the training again. Then I found that the output was normalized to [-1,1] rather than [0,1]. My first question is: why isn’t the output normalized to [0,1]? Is there any way I can control how the output is normalized?
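For reference, my setup looks roughly like this (a sketch only: the tree name “simu”, the branch names in0..in5/type and the hidden-layer size are placeholders, not my actual ones):

// Sketch: "simu" is a TTree with the six input branches in0..in5 and a
// branch "type" that is 0 (background) or 1 (signal).  The "@" in front
// of the output neuron requests output normalization.
TMultiLayerPerceptron *mlp =
   new TMultiLayerPerceptron("in0,in1,in2,in3,in4,in5:10:@type",
                             simu, "Entry$%2", "(Entry$+1)%2");
mlp->Train(100, "text,graph,update=10");
mlp->Export("nnsel", "C++");   // writes the standalone nnsel class used below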

What confused me more is what happened when I exported the NN as a function for further use. I use the NN function as follows:

nnsel *nn = new nnsel();
Double_t nn_val = nn->value(0,in0,in1,in2,in3,in4,in5);

But the distribution of the computed nn_val was observed to lie in [0.5,1], which looked very strange. So I checked the code of the generated class nnsel. The implementation of the output neuron is

double nnsel::neuron0x1b663a0() {
   double input = input0x1b663a0();
   return (input * 0.5)+0.5;
}

which I guess is meant to map the output from [-1,1] to [0,1]. The implementation of nnsel::value() is

double nnsel::value(int index,double in0,double in1,double in2,double in3,double in4,double in5) {
   input0 = (in0 - 6.07017)/1.62633;
   input1 = (in1 - 0)/1;
   input2 = (in2 - 0)/1;
   input3 = (in3 - 0)/1;
   input4 = (in4 - 1292.67)/465.07;
   input5 = (in5 - 3272.29)/941.692;
   switch(index) {
     case 0:
         return ((neuron0x1b663a0()*0.5)+0.5);  
     default:
         return 0.;
   }
}

which applies the normalization a second time: (x*0.5)+0.5 applied to a value already in [0,1] gives a value in [0.5,1]. This is probably why nn_val was observed in [0.5,1].
If I simply modify

return ((neuron0x1b663a0()*0.5)+0.5);  

to

return neuron0x1b663a0(); 

i.e. skip the second normalization, then the output is in [0,1], just as I want. So I guess there may be a bug in the TMultiLayerPerceptron::Export() method when the output normalization option is chosen (a “@” put in front of the output neuron), if my statement above is right.

Can anyone give me a hint about what is going on?

Thanks for reporting this. There is indeed a bug in the Export function in case of normalized outputs. The fix should appear in CVS soon.
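In the meantime, if you do not want to edit the generated file, you can simply undo the extra (x*0.5)+0.5 mapping applied in value(). A sketch:

// Sketch of a temporary workaround: value() applies (x*0.5)+0.5 one time
// too many, so invert that last mapping to recover the [0,1] output.
double nn_corrected(nnsel &nn, double in0, double in1, double in2,
                    double in3, double in4, double in5)
{
   double v = nn.value(0, in0, in1, in2, in3, in4, in5);  // lies in [0.5,1]
   return (v - 0.5) * 2.0;                                // back to [0,1]
}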

Concerning your original problem, I would advise you to switch to the cross-entropy errors, “which allows to train a network for pattern classification based on Bayesian posterior probability” (from the class documentation).

Since your output is already 0 or 1, simply add a “!” at the end of your mlp formula. The mlp will then produce an output strictly contained in [0,1], which can be interpreted as a probability.

Note: the option to normalize input and/or output is there mainly for technical reasons: training is easier if the inputs are of order +/-1, which fits the natural range of a sigmoid function.

Thanks for the prompt reply. But there is still something I don’t understand clearly.

What do you mean by “switch to the cross-entropy errors”?

Simply add a “!” at the end of your mlp formula, to get for example “a,b,c,d:8:3:output!”.

With that option, TMultiLayerPerceptron uses cross-entropy errors, which allow training a network for pattern classification based on Bayesian posterior probability.
Reference: [Bishop 1995, Neural Networks for Pattern Recognition], in particular chapter 6.
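With your six inputs that would look something like this (again just a sketch with placeholder names):

// Sketch: same placeholder tree/branches as before, but with "!" appended
// to the output neuron so that cross-entropy errors are used.
TMultiLayerPerceptron *mlp =
   new TMultiLayerPerceptron("in0,in1,in2,in3,in4,in5:10:type!",
                             simu, "Entry$%2", "(Entry$+1)%2");
mlp->Train(100, "text,graph,update=10");

// Evaluate one event; the result can be read as P(signal | inputs).
Double_t params[6] = { in0, in1, in2, in3, in4, in5 };   // this event's input values
Double_t p = mlp->Evaluate(0, params);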

I see. Thank you very much.

Haibing ZHANG