TMVA: Separately passing Train and Test Tree for Signal and Background

Dear rooters,

I have an apparently silly question that has been bothering me for two weeks.

I have some old code using TMVA in which I add signal and background as follows:

 fTMVAdataloader[iTMVA]->AddSignalTree(signalTree, signalWeight);
 fTMVAdataloader[iTMVA]->AddBackgroundTree(backgroundTree, backgroundWeight);

Now I would like to pass separate training and test trees explicitly:

 fTMVAdataloader[iTMVA]->AddSignalTree(signalTestTree, signalWeight, TMVA::Types::kTesting);
 fTMVAdataloader[iTMVA]->AddSignalTree(signalTrainTree, signalWeight, TMVA::Types::kTraining);
 fTMVAdataloader[iTMVA]->AddBackgroundTree(backgroundTestTree, backgroundWeight, TMVA::Types::kTesting);
 fTMVAdataloader[iTMVA]->AddBackgroundTree(backgroundTrainTree, backgroundWeight, TMVA::Types::kTraining);

In both cases, I am running them with the following options: “nTrain_Signal=0:nTrain_Background=0:SplitMode=Alternate:NormMode=NumEvents:!V”

In the first case, TMVA automatically uses even events for training and odd events for testing.
In the second case, I manually divided the original two files into four final files using the same criterion.
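
For reference, the even/odd criterion described above can be sketched in plain C++. This is only an illustration of the manual split: `SplitIndices` and `splitEvenOdd` are hypothetical names, not TMVA classes or functions, and the sketch assumes entries are counted from 0 with even indices going to training. It does not reproduce what TMVA does internally for any `SplitMode`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of TMVA): entry indices assigned to each subsample.
struct SplitIndices {
    std::vector<std::size_t> train;
    std::vector<std::size_t> test;
};

// Assign even entry indices (0, 2, 4, ...) to training and odd ones to testing.
// Assumption: entries are numbered from 0 in the order they appear in the tree.
SplitIndices splitEvenOdd(std::size_t nEntries) {
    SplitIndices out;
    for (std::size_t i = 0; i < nEntries; ++i) {
        if (i % 2 == 0) {
            out.train.push_back(i);
        } else {
            out.test.push_back(i);
        }
    }
    // Every entry must end up in exactly one of the two subsamples.
    assert(out.train.size() + out.test.size() == nEntries);
    return out;
}
```

For a tree with 5 entries, `splitEvenOdd(5)` puts indices 0, 2, 4 in the training list and 1, 3 in the test list.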

If I check the spectator variables and the variables used for the BDT, the output histograms are exactly the same in the two cases, which confirms that I have split the files correctly in the second case.

However, the distribution of the BDT variable differs between the two cases, and I noticed that, in the second case, it changes when I switch “SplitMode” between “Alternate”, “Random” and “Block”.

Originally, I thought this flag had no meaning in the second case. I have now realized that it does have an effect, but I do not understand which configuration is correct.

My question is: if I want to set up TMVA as in the second case but obtain exactly the same training as in the first case, which options or commands are needed?

Thank you!

Maybe my question was too long. To put it more simply:

  • in general, we pass the signal and background files separately and tell TMVA how to divide each sample into training and test subsamples
  • how can I change this approach and instead pass four separate files, i.e. training-signal, test-signal, training-background and test-background?

Hi @kormoranos,
I think we need our TMVA expert @moneta, let’s ping him.

I would need to check your second approach. Can you please post your files and your code,