The utility of NN in complex data analyses is (almost) unquestionable.
It’s also (almost) unquestionable that ROOT is a good choice for a
data analysis framework. Given these two facts, it’s easy to
see what drove people who wrote ROOT-to-JETNET and a couple of
ROOT-to-SNNS interfaces (JETNET and SNNS being two rather solid NN
packages). However, it is kind of odd to keep your data in ROOT
TTree(s) and then interface them to NN simulators, and it’s an even
bigger inconvenience to have to learn the language those NN simulators speak.
So, the implementation of some NN functionality in ROOT is a very
welcome step. But why not do it in a professional manner? I mean,
why not design it in such a way that it addresses the typical
problems that arise when one uses NNs? Below are a few
examples/concerns:
1) The typical situation is that the learning samples for SIG and BKG
are obtained from different sources (often one comes from some sort of
Monte Carlo simulation, while the other comes from a different Monte
Carlo simulation or from real data). For that reason they are often
kept in different files or in differently named trees in the same file.
Why is it that TMultiLayerPerceptron provides no way to say: look,
here’s my TTree for SIG (target=1), and here’s my TTree for BKG
(target=0)? Why is the user burdened with merging the two TTrees
(which, rather ironically, takes half of the mlpHiggs.C example)?
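For concreteness, the merge the user is currently forced to do amounts to nothing more than attaching a target label while concatenating the two samples. A plain-C++ sketch of that idea (no ROOT; `Row` and `mergeLabeled` are hypothetical names), which is exactly the kind of boilerplate a two-tree API could absorb:

```cpp
#include <vector>

// One row of the learning sample: the input variables plus the
// supervised target (1 = signal, 0 = background).
struct Row {
    std::vector<double> inputs;
    int target;
};

// Conceptually all the user wants: take a signal sample and a
// background sample kept in separate containers (separate TTrees, in
// ROOT terms) and produce one labeled learning sample.
std::vector<Row> mergeLabeled(const std::vector<std::vector<double>>& sig,
                              const std::vector<std::vector<double>>& bkg) {
    std::vector<Row> merged;
    merged.reserve(sig.size() + bkg.size());
    for (const auto& v : sig) merged.push_back({v, 1});  // SIG: target = 1
    for (const auto& v : bkg) merged.push_back({v, 0});  // BKG: target = 0
    return merged;
}
```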
2) Why does TMLPAnalyzer->DrawNetwork() draw what it draws? If
anything, I would expect it to draw the same thing that
TMultiLayerPerceptron->Draw() does. And why (if it draws anything)
does the user have no control over the style (or have to jump through
a dozen hoops in order to get that control)? What is the point of
DrawNetwork() for neuron!=0? If there is no point, why have Int_t
neuron as an argument? If there is a point, where is that documented
and how does one make sense of what (s)he sees?
3) For a simple classification problem (sig/bkg ~ 1/0), a well-trained
NN should have the property that SIG/(SIG+BKG) is a linear function of
NN_output. There should be a method in TMLPAnalyzer that checks this.
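The check asked for here is easy to automate: bin the events in NN output, compute the purity SIG/(SIG+BKG) per bin, and report the worst deviation from the expected straight line. A minimal plain-C++ sketch (no ROOT; the function name is mine), assuming equally sized SIG and BKG samples, in which case the expected line is simply purity = output:

```cpp
#include <cmath>
#include <vector>

// For a well-trained classifier (and equal-size SIG/BKG samples) the
// purity SIG/(SIG+BKG) in a bin of NN output should track the bin
// center.  This computes per-bin purities over [0,1) and returns the
// largest deviation from that straight line.
double maxPurityDeviation(const std::vector<double>& sigOut,
                          const std::vector<double>& bkgOut,
                          int nbins) {
    double worst = 0.0;
    for (int b = 0; b < nbins; ++b) {
        double lo = double(b) / nbins, hi = double(b + 1) / nbins;
        double nsig = 0, nbkg = 0;
        for (double x : sigOut) if (x >= lo && x < hi) ++nsig;
        for (double x : bkgOut) if (x >= lo && x < hi) ++nbkg;
        if (nsig + nbkg == 0) continue;        // empty bin: no information
        double purity = nsig / (nsig + nbkg);
        double center = 0.5 * (lo + hi);
        worst = std::max(worst, std::fabs(purity - center));
    }
    return worst;
}
```

A large returned value would flag exactly the pathology the method should catch: a network whose output cannot be read as a signal probability.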
4) Why is it that if one executes the example (mlpHiggs.C) a few times
in a row (whether starting a new ROOT session each time or not) the
result is different each time? If that’s the desired behavior, why is
it not announced? How does one switch it off? I thought that an
example should answer questions, not raise additional ones.
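Most likely the run-to-run differences come from randomly chosen initial weights, in which case one should be able to fix the seed of whatever random generator is used (whether TMultiLayerPerceptron exposes that is exactly the kind of thing its documentation should state). The generic mechanism, sketched without ROOT (function name mine):

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Random initial weights are the usual source of run-to-run variation.
// With a fixed seed every run starts from the same weights and the
// whole training becomes reproducible; with a time-based seed it does not.
std::vector<double> initialWeights(unsigned seed, std::size_t n) {
    std::mt19937 gen(seed);                        // fixed seed -> same sequence
    std::uniform_real_distribution<double> dist(-0.5, 0.5);
    std::vector<double> w(n);
    for (auto& wi : w) wi = dist(gen);
    return w;
}
```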
5) TMLPAnalyzer->DrawDInputs(): is this the best way to illustrate
which variable/input is relevant and which one is not? I thought that
correlation to the target is a good quantitative measure of how
relevant a particular variable/input is. How is one to interpret the
picture this method draws? Where is that explained/documented? Why are
the axes not labeled?
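The quantitative measure mentioned above is easy to spell out: the Pearson correlation between an input variable and the target (1 for SIG, 0 for BKG), computed over the combined sample. A plain-C++ sketch (function name mine):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Pearson correlation between one input variable and the target.
// Values near +/-1 mean the variable alone separates SIG from BKG;
// values near 0 suggest it carries little (linear) information.
double correlationToTarget(const std::vector<double>& x,
                           const std::vector<double>& target) {
    const std::size_t n = x.size();
    double mx = 0, mt = 0;
    for (std::size_t i = 0; i < n; ++i) { mx += x[i]; mt += target[i]; }
    mx /= n; mt /= n;
    double sxt = 0, sxx = 0, stt = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sxt += (x[i] - mx) * (target[i] - mt);
        sxx += (x[i] - mx) * (x[i] - mx);
        stt += (target[i] - mt) * (target[i] - mt);
    }
    return sxt / std::sqrt(sxx * stt);
}
```

(Being a linear measure it can of course miss nonlinear dependence, but as a first, well-defined relevance number it beats an unlabeled picture.)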
6) I have heard that if one is to be completely unbiased, then one
needs three samples: the standard two (learning and testing, i.e. the
“stop training” sample) and another “testing” sample on which to
verify the performance of the NN, e.g., do the test I ask for in 3),
do the TMLPAnalyzer->DrawNetwork(0, “target==0”, “target==1”) thing, etc.
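What is described above is the standard three-way split: a learning sample, a validation ("stop training") sample, and a held-out test sample used exactly once for the final performance quote. Sketched over event indices in plain C++ (names mine; a real split would typically be randomized rather than round-robin):

```cpp
#include <cstddef>
#include <vector>

struct Split { std::vector<std::size_t> train, valid, test; };

// Deterministic round-robin 3-way split, for illustration only:
// 'train' drives the weight updates, 'valid' decides when to stop
// training, and 'test' is touched only once, for the final, unbiased
// performance measurement.
Split threeWaySplit(std::size_t n) {
    Split s;
    for (std::size_t i = 0; i < n; ++i) {
        switch (i % 3) {
            case 0:  s.train.push_back(i); break;
            case 1:  s.valid.push_back(i); break;
            default: s.test.push_back(i);  break;
        }
    }
    return s;
}
```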
7) I think the TMultiLayerPerceptron->Train() method needs better
formatted output. It would also be nice if one could tell it to save
the best NN to a file, so that in case the user wants to interrupt
after 20 hours of training (or a machine goes down, or whatever), the
user still has something to work with.
8) What happens if the user asks to train for 500 epochs and the
network starts getting over-trained after epoch 300? Will the user
get an over-trained network, or will the “best” one be saved at epoch
300? Where is this documented/explained?
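Both of the last two points boil down to the same missing piece: keep the best network seen so far. After every epoch, if the validation error improved, snapshot the weights; then neither over-training nor an interruption loses the best result. A minimal plain-C++ sketch (names mine; in a real trainer the snapshot would go to a file rather than a member variable):

```cpp
#include <vector>

// Keep-the-best logic: after each epoch, if the validation error
// improved, snapshot the current weights.  Whatever happens later
// (over-training, interruption, crash), the best network so far is
// the one that was last snapshotted.
struct BestKeeper {
    double bestError = 1e300;
    std::vector<double> bestWeights;
    int bestEpoch = -1;

    void update(int epoch, double validError, const std::vector<double>& w) {
        if (validError < bestError) {
            bestError   = validError;
            bestWeights = w;          // in practice: dump weights to disk here
            bestEpoch   = epoch;
        }
    }
};
```

With this in place, asking for 500 epochs is harmless: if the validation error bottoms out at epoch 300, the epoch-300 weights are what survives.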
I sure have more questions/suggestions/items_to_discuss…
Konstantin.