A few questions about the ROOT MLP package

Hello, I have a few questions about the ROOT MLP package.
Maybe somebody who understands it better can answer them.

  1. What is the meaning of weights in a call to train a NN?

    For example, after modeling my program on the tutorial
    mlpHiggs.C, I define the NN as:

    TMultiLayerPerceptron *mlp = new TMultiLayerPerceptron(
        "int1,int2,int3,int4,int5,npeaks:5:3:type",
        "weight", simu, "Entry$%2", "(Entry$+1)%2");

    Now, when I set weight=1 for all events, everything is OK.
    I tried to set the weights in such a way that an event that is
    more likely to be a signal has a larger weight, but then
    the learning failed. Why? I guess I misunderstand the
    meaning of the "weight" variable?

  2. Can the weight be omitted? E.g. can I have a NN definition
    like

TMultiLayerPerceptron *mlp = new TMultiLayerPerceptron(
    "int1,int2,int3,int4,int5,npeaks:5:3:type",
    simu, "Entry$%2", "(Entry$+1)%2");

  3. What is the meaning of normalization? What is normalized
    to what? Will the definitions

    TMultiLayerPerceptron *mlp = new TMultiLayerPerceptron(
        "int1,int2,int3,int4,int5,npeaks:5:3:type",
        "weight", simu, "Entry$%2", "(Entry$+1)%2");

and

TMultiLayerPerceptron *mlp = new TMultiLayerPerceptron
("@int1,@int2,@int3,@int4,@int5,@npeaks:5:3:type",
 "weight",simu,"Entry$%2","(Entry$+1)%2");

produce different results? I can't see any significant difference
in my example …

  4. in the call

mlp->Evaluate(0,params)

what is the meaning of the first variable (index)? I use
the value 0 as in the ROOT tutorial example mlpHiggs.C,
but I don't really understand why …

  5. OK, what is the smartest way to define the NN? How many
    hidden layers and how many neurons in each hidden layer
    will give the best results? Can somebody suggest a
    Getting-Started/Simple-and-Elementary introduction
    (book/article/paper/URL) to designing the NN that best
    fits the problem at hand?

    Or is everything just trial and error? (I doubt that …)

                                       Cheers, Emil

[quote="emil"]1. What is the meaning of weights in a call to train a NN?

For example, after modeling my program on the tutorial
mlpHiggs.C, I define the NN as:

TMultiLayerPerceptron *mlp = new TMultiLayerPerceptron
("int1,int2,int3,int4,int5,npeaks:5:3:type",
 "weight",simu,"Entry$%2","(Entry$+1)%2");

[/quote]
The weight allows you to give more importance to some events during the learning process. This can be used if you have events with different probabilities (or cross-sections), or if your training sample has more events of type “0” than events of type “1”. When you train a network to distinguish signal from background, you must take care to have Integral(Signal)~Integral(Background) to obtain a well-behaved NN.
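
As a minimal sketch (assuming, as in mlpHiggs.C, that the class label is the "type" branch), one way to satisfy Integral(Signal)~Integral(Background) is to give each class a constant weight inversely proportional to its population:

Long64_t nSig = simu->GetEntries("type==1");   // number of signal events
Long64_t nBkg = simu->GetEntries("type==0");   // number of background events
Double_t wSig = 1.0;
Double_t wBkg = (nBkg > 0) ? (Double_t)nSig / nBkg : 1.0;
// fill the "weight" branch with wSig for signal and wBkg for background;
// both classes then carry the same total weight during training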

There are two constructors without the weight:

TMultiLayerPerceptron(const char* layout, TTree* data = 0,
                      const char* training = "Entry$%2==0", const char* test = "",
                      TNeuron::ENeuronType type = TNeuron::kSigmoid,
                      const char* extF = "", const char* extD = "")

TMultiLayerPerceptron(const char* layout, TTree* data,
                      TEventList* training, TEventList* test,
                      TNeuron::ENeuronType type = TNeuron::kSigmoid,
                      const char* extF = "", const char* extD = "")

Please refer to the reference guide.
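
For instance, a sketch using the first of those constructors with the layout and tree from your example (every event then enters the training with weight 1):

TMultiLayerPerceptron *mlp = new TMultiLayerPerceptron(
   "int1,int2,int3,int4,int5,npeaks:5:3:type",
   simu, "Entry$%2", "(Entry$+1)%2");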

[quote] 3. What is the meaning of normalization? What is normalized
to what? Will the definitions[/quote]
Inputs and output should be “normalized” so that the mean is 0 and the RMS 1. This is not mandatory, but the training will be more efficient in that case, since [-1,1] is the natural range covered by the sigmoid function. Of course, normalizing discrete variables does not make much sense.
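
As a quick sketch (the histogram name is arbitrary), you can check whether an input already has mean ~0 and RMS ~1 before deciding to use the "@" prefix:

simu->Draw("int1>>hInt1");                    // project one input variable
TH1 *hInt1 = (TH1*)gDirectory->Get("hInt1");  // retrieve the histogram
printf("int1: mean = %g, RMS = %g\n", hInt1->GetMean(), hInt1->GetRMS());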

[quote] 4. in the call

mlp->Evaluate(0,params)

what is the meaning of the first variable (index)? I use
the value 0 as in the ROOT tutorial example mlpHiggs.C,
but I don't really understand why …[/quote]
This is the index of the output neuron you are interested in. In your case, there is a single neuron, so index=0.
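
A minimal sketch with made-up input values: params must hold the inputs in the same order as in the layout string, and the first argument selects the output neuron to read.

Double_t params[6] = {1.2, 0.4, 0.7, 2.1, 0.3, 5.0};  // hypothetical int1..int5, npeaks
Double_t response = mlp->Evaluate(0, params);          // 0 = first (and only) output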

[quote]5. OK, what is the smartest way to define the NN? How many
hidden layers and how many neurons in each hidden layer
will give the best results? Can somebody suggest a
Getting-Started/Simple-and-Elementary introduction
(book/article/paper/URL) to designing the NN that best
fits the problem at hand?
Or is everything just trial and error? (I doubt that …)[/quote]
I don't know of rules for building a good network. In general, having N inputs and a single output, a single hidden layer with between N/2 and 2N neurons is enough. But for that, maybe somebody else can answer better.
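
As an illustration of that rule of thumb for the 6 inputs in your example (from N/2 = 3 up to 2N = 12 hidden neurons), one would simply vary the hidden part of the layout string and compare the error on the test sample:

TMultiLayerPerceptron *mlpSmall = new TMultiLayerPerceptron(
   "int1,int2,int3,int4,int5,npeaks:3:type",   // 3 hidden neurons
   "weight", simu, "Entry$%2", "(Entry$+1)%2");
TMultiLayerPerceptron *mlpLarge = new TMultiLayerPerceptron(
   "int1,int2,int3,int4,int5,npeaks:12:type",  // 12 hidden neurons
   "weight", simu, "Entry$%2", "(Entry$+1)%2");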

Cheers,
Christophe.

Thanks for your prompt and clarifying reply, Christophe.

Just one quick follow-up question: you write that, regarding normalization, it is more
efficient to "normalize" the inputs and the output so that the mean is 0 and the RMS is 1.
Should I then set the output to 1 for signal and -1 for background? In your mlpHiggs.C
tutorial example the signal is type=1 and the background type=0 …? Would that make any difference?

                                       Regards, Emil