TMVA-Wrong Number of Entries w.r.t Inputs

ROOT Version: 6.24.06
Platform: centos7.6
Compiler: Not Provided


Hello everyone,
I am currently using the TMVA framework to build a BDT.
I will explain in two parts: one version that works and another that does not.
Works:
To get the inputs for my BDT (signal and background), I have code that builds two trees: one for signal and one for background.
The good thing here is that I only need to fill my trees with branches that have the same number of entries.
Therefore, I only need to call Fill() on the tree once the values of my variables have been assigned to my branches.
To simplify:

for (...)
{
    myvariable1 = myvalue1;
    myvariable2 = myvalue2;
    // etc.
    if (condition) { treeSignal->Fill(); }
    else { treeBkg->Fill(); }
}

With this configuration, all my variables have the correct number of entries when I look at the “Input Variables” in the TMVAGui.
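
The working, row-wise pattern above can be modelled without ROOT as a minimal sketch. `Row`, `RowTree`, `fillRowWise`, and the `isSignal` flags are hypothetical names invented for this illustration, not ROOT classes; `RowTree::Fill` only imitates the effect of `TTree::Fill()`, which stores all variables together as one entry.

```cpp
#include <cstddef>
#include <vector>

// One row of a toy "tree": all variables of an entry are stored together.
struct Row {
    double var1;
    double var2;
};

// Toy stand-in for a tree filled with one Fill() call per entry.
struct RowTree {
    std::vector<Row> rows;
    void Fill(const Row& r) { rows.push_back(r); }
    std::size_t GetEntries() const { return rows.size(); }
};

// Assign all variables first, then fill exactly one of the two trees.
// isSignal[i] is a hypothetical flag standing in for "condition".
void fillRowWise(const std::vector<Row>& candidates,
                 const std::vector<bool>& isSignal,
                 RowTree& treeSignal, RowTree& treeBkg)
{
    for (std::size_t i = 0; i < candidates.size(); ++i) {
        const Row current = candidates[i];   // myvariable1, myvariable2, ...
        if (isSignal[i]) treeSignal.Fill(current);
        else             treeBkg.Fill(current);
    }
}
```

Because every `Fill` appends a complete row, each variable in a given toy tree ends up with the same number of entries by construction.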
Does not work:
I am still building two trees; however, the different branches are not filled at the same time.
I know that this specific subject has already been covered, and that it can be handled by doing this:

treeSignal->GetBranch("name_of_branch1")->Fill();
... // same with treeBkg
treeSignal->GetBranch("name_of_branch2")->Fill(); // with a different number of entries
...
treeSignal->SetEntries(totalNumberofEntriesofAllBranches);
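
To make the bookkeeping of this branch-by-branch filling explicit, here is a toy model that does not use ROOT. `ColumnTree` and its methods are hypothetical names for this sketch; they only imitate the idea that each branch keeps its own entry count while the tree-level count is whatever was passed to `SetEntries`.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a tree whose branches are filled independently.
struct ColumnTree {
    std::map<std::string, std::vector<double>> branches;
    std::size_t entries = 0;   // the count the tree itself reports

    // Imitates tree->GetBranch(name)->Fill(): only this branch grows.
    void FillBranch(const std::string& name, double value) {
        branches[name].push_back(value);
    }

    // Imitates tree->SetEntries(n): the tree-level count is set by hand.
    void SetEntries(std::size_t n) { entries = n; }

    std::size_t GetEntries() const { return entries; }

    // Number of values actually stored in one branch (0 if it is unknown).
    std::size_t GetBranchEntries(const std::string& name) const {
        auto it = branches.find(name);
        return it == branches.end() ? 0 : it->second.size();
    }
};
```

In this model a branch can hold fewer values than `GetEntries()` reports, so any consumer that loops from 0 to `GetEntries()` will ask the shorter branch for entries it does not have.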

This actually works fine when I look at my trees: all the branches have the right number of entries.
The issue arises when I give these trees as inputs to my BDT.
TMVA takes “totalNumberofEntriesofAllBranches” as the number of entries for all variables, even though these variables do not all have the same number of entries. This gives a weird behavior where the variable with fewer entries is filled with a certain value until “totalNumberofEntriesofAllBranches” is reached (see the picture of TMVAGui -> Input Variables).
Screenshot from 2023-02-22 14-22-08
The peaks are the weird behavior; they do not show up when I look at this distribution in my tree.
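
Where the repeated value comes from is not established in this thread. One possible explanation, offered purely as an assumption, is that an entry-by-entry reader leaves the variable's buffer untouched when a branch has no data for the requested entry, so the last value read is seen again and again. The sketch below models that idea in plain C++; `readAllEntries` and `initialBuffer` are hypothetical names, and nothing here calls ROOT or TMVA.

```cpp
#include <cstddef>
#include <vector>

// Read one branch entry by entry, up to the tree-level entry count.
// ASSUMPTION: when the branch has no entry i, nothing is read and the
// buffer keeps its previous value. This is a guess at the mechanism,
// not confirmed TMVA behaviour.
std::vector<double> readAllEntries(const std::vector<double>& branch,
                                   std::size_t treeEntries,
                                   double initialBuffer = 0.0)
{
    std::vector<double> seen;
    seen.reserve(treeEntries);
    double buffer = initialBuffer;   // variable bound to the branch
    for (std::size_t i = 0; i < treeEntries; ++i) {
        if (i < branch.size()) {
            buffer = branch[i];      // a real entry overwrites the buffer
        }
        seen.push_back(buffer);      // otherwise the stale value is reused
    }
    return seen;
}
```

Under this assumption, a branch with 2 real values read over 5 tree entries yields 3 extra copies of the last value, which would show up as a single-bin peak in a histogram of the variable.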

Has anyone ever encountered this behavior, or am I misusing SetEntries in the case that does not work?
Sorry for being so long; I hope I made it clear.
Thanks to anyone who reads all of this!
Paul

Hello @Threshic ,

a small, self-contained reproducer for the issue would really help us figure out what’s going wrong.

Other than that maybe @moneta has other suggestions.

Cheers,
Enrico

Hi,
Yes, please post the example code and especially the file containing the signal and background trees that do not work. I have not understood how this TTree is structured.

Lorenzo

I tried to make it as simple as possible with respect to what my code actually is.

Basically, I am looping over muons and filling the basic information (mva_muon_pt, mva_muon_eta, …).
Now I have added two other pieces of information that are not filled at the same time as the basic muon information: mva_Met and mva_Mmumu. I have fewer entries for these. You shouldn’t see any issue with this in the output50_Neu_Forum.root file. However, after training, these variables with fewer entries will be filled with a certain value (I don’t know how it is defined) until the maximum number of entries is reached.
File containing the trees: output50_Neu_Forum.root (I am actually using one file for the signal and another one for the background, but the issue is independent of the input file). I compile using runscript.C.
output50_Neu_Forum.root (1.2 MB)

TreeReader.h (87.7 KB)
TreeReader.C (18.2 KB)

The output50 file is built using TreeReader.C and TreeReader.h and a “heavy” ntuple (>100 MB file). I actually need the headers (I tried to be fast, but if you really need something less “messy”, I can change my code; you can actually train a BDT using the output50_Neu_Forum file): HistoManager.h and HistoManager.cc, Muon.h and Muon.cc, PrimaryVertex.cc and PrimaryVertex.h, etc.
PrimaryVertex.h (1.4 KB)
Track.h (12.0 KB)
TreeFormat.h (30.2 KB)
PFJet.cc (823 Bytes)
Track.cc (11.5 KB)

runscript.C (368 Bytes)
DeltaFunc.h (507 Bytes)
HistoManager.h (1.5 KB)
Muon.h (1.9 KB)
PFJet.h (1.7 KB)
PrimaryVertex.cc (719 Bytes)
Muon.cc (1.1 KB)
HistoManager.cc (3.4 KB)

Thank you for posting all this information. Sorry for the delay, I’ll try to look at it.

Lorenzo