Vector entries in TMVAClassification and Application

Hello dear ROOTforum,

I am having trouble with the TMVA macros. The source tree has vector-type branches. In the dataloader part of the classification macro I put this

  dataloader->AddVariable( "B_J_px2", "B_J_px2", "units", 'F' );

and it works fine. But in the application macro I declared the same variables as in the classification macro (as the tutorial explains):

TMVA::Reader *reader = new TMVA::Reader( "!Color:!Silent" );
   // Create a set of variables and declare them to the reader
   // - the variable names MUST correspond in name and type to those given in the weight file(s) used

   Float_t  B_J_px2; ...
   reader->AddVariable( "B_J_px2", &B_J_px2 );...

TFile* datafile = TFile::Open("Datafile.root");
std::cout << "TMVAClassificationApp :Using input file: " << datafile->GetName() << std::endl;

std::cout << "--- Select signal sample" << std::endl;
TFile* signalfile = new TFile("MC_file.root");  //open the file
TTree* signaltree = (TTree*)signalfile->Get("rootuple/ntuple");
  
signaltree->SetBranchAddress( "B_J_px2", &B_J_px2 );

The application macro compiles fine, but when I execute it I get this:

Error in <TTree::SetBranchAddress>: The pointer type given "Float_t" (5) does not correspond to the type needed "vector<float>" by the branch: B_J_px2

I thought the solution might be to specify a vector type in the dataloader, I mean

 dataloader->AddVariable( "B_J_px2", "B_J_px2", "units", 'vector<float> type' );

but reading the TMVA documentation (https://root.cern/doc/v612/classTMVA_1_1DataLoader.html#af2de13debc441fd2c4f1cd826ef175f9) I saw that this is not possible.

I saw this contribution in the forum (Vector in Branch), but I can’t do that because I would have to redeclare the variables, which isn’t allowed.

Another thing I thought might work is reading the vector branches as arrays of floats in SetBranchAddress, but I don’t know whether that affects how the weights are read.

Could you please help me with this? I attach the classification and application macros and links to the ROOT files.

TMVAClassification_Bmesonfinal.C (23.0 KB)
TMVAClassificationApplication_Bmesonfinal.C (29.3 KB)

Links: https://drive.google.com/open?id=1bHeUDKmW74VC4dOjK2OYQkfCfjcaJS9a
https://drive.google.com/open?id=1WOzMChyvWe-H64w-eihM0gfkkVUPpqR4

Thanks for your attention and sorry for bothering you,
Cheers,
Karen

It looks like the type mismatch reported in the error message is the problem. If the branch in the file really is a vector, you cannot pass the address of a single float to SetBranchAddress(). You need to use an equivalent vector type in that case. Using a vector in a branch is also discussed in this other post, if you’d like to have a look.
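As a rough sketch of what this could look like in the application macro: the branch is read into a `std::vector<float>`, and the values are then copied into the plain floats that were registered with the reader. The helper below shows only that copy step, in plain C++ so it runs without ROOT. The name `copyToReaderSlots` and its parameters are hypothetical, not part of TMVA or of the attached macros.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a ROOT/TMVA function): copy the first `nSlots`
// elements of the vector read from the branch into the plain floats whose
// addresses were given to the reader.
// Returns false if the vector holds fewer than `nSlots` elements, so the
// caller can decide to skip that tree entry.
bool copyToReaderSlots(const std::vector<float>& branchValues,
                       float* slots, std::size_t nSlots) {
    assert(slots != nullptr);
    if (branchValues.size() < nSlots) {
        return false;
    }
    for (std::size_t i = 0; i < nSlots; ++i) {
        slots[i] = branchValues[i];
    }
    return true;
}
```

In a ROOT macro, the vector would typically be bound with something like `std::vector<float>* px2 = nullptr; signaltree->SetBranchAddress("B_J_px2", &px2);`, and the helper would be called on `*px2` after each `GetEntry()` and before `reader->EvaluateMVA(...)`. Which elements need to be copied depends on how the variables were declared during training.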

Note that the way to post code is to enclose it with three back-quotes as described here.

Hi Karen,

Do note that adding a vector-valued variable creates one TMVA event for each element in the vector/array. If you have 10 entries in your tree and your array leaf has 10 elements, you would add 100 events to TMVA in total.
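The arithmetic can be illustrated with a small plain C++ function that sums the vector sizes over all tree entries. It mirrors the counting described here and does not call TMVA. The name `countFlattenedEvents` is made up for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Illustration only: given the number of vector elements in each tree entry,
// return how many events result if every element becomes its own event.
std::size_t countFlattenedEvents(const std::vector<std::size_t>& elementsPerEntry) {
    std::size_t total = 0;
    for (std::size_t n : elementsPerEntry) {
        total += n;
    }
    return total;
}
```

For 10 tree entries with 10 elements each, `countFlattenedEvents(std::vector<std::size_t>(10, 10))` returns 100.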

It might be that you want to add each element of the vector as a separate feature. In that case you would have to do something along the lines of

dataloader->AddVariable("B_J_px2[0]"); // Add 1st element as separate feature
dataloader->AddVariable("B_J_px2[1]"); // Add 2nd element as separate feature
dataloader->AddVariable("B_J_px2[2]"); // Add 3rd element as separate feature

Cheers,
Kim

Hi kialbert,

I did what you suggested as follows: I used only one vector branch of my source tree and wrote the code as:

...
 dataloader->AddVariable( "nB", "nB", "units", 'i' );
 dataloader->AddVariable( "nMu", "nMu", "units", 'i' );
//Vector branch:   dataloader->AddVariable( "B_px", "B_px", "units", 'F' ); 
TBranch *branch = (TBranch*)datatree->GetBranch("B_px");
Long64_t nentries = branch->GetEntries();

for(Long64_t i=0; i<= nentries - 1; i++)
  {
   dataloader->AddVariable("B_px[i]", "B_px[i]", "units", 'F' );
  }
...

Because the branch has a lot of entries (~40,000), the code compiles, but on execution it just reads the input variables, prints “MLP … Building network”, and then does nothing.

Afterwards I opened a TBrowser, and when I tried to open the resulting ROOT output file I got an error:

Error in <TFile::ReadBuffer>: error reading all requested bytes from file TMVAClassification_Bmesonprueba.root, got 268 of 300
Error in <TFile::Init>: TMVAClassification_Bmesonprueba.root failed to read the file type data.

I also tried Int_t instead of Long64_t, and that does not work either. What else can I do?

Thank you very much for your attention.

Cheers,
Karen

Hi,

You don’t want to add a variable per entry in the branch; rather, you want to add a variable for each element in the vector (assuming the branch B_px is still a vector).

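Int_t vector_size = 5; // example value: number of vector elements to use as features
for (Int_t i = 0; i < vector_size; ++i) {
   // Build the expression "B_px[0]", "B_px[1]", ... for each element
   TString branch_name = Form("B_px[%d]", i);
   dataloader->AddVariable(branch_name, branch_name, "units", 'F');
}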

If the vector is of variable size, we need to consider another approach. Building a classifier (especially an old one like MLP) with 40000 inputs can indeed take some time to initialise and run.
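One possible approach for variable-size vectors, offered only as an assumption and not necessarily what is meant above, is to pad or truncate each vector to a fixed length before using its elements as features. The sketch below is plain C++. The name `padOrTruncate` and the fill value are hypothetical, and whether a sentinel fill value suits a given classifier has to be checked case by case.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: return a copy of `values` with exactly `length`
// elements. Extra elements are dropped; missing ones are set to `fill`.
std::vector<float> padOrTruncate(const std::vector<float>& values,
                                 std::size_t length, float fill) {
    const std::size_t keep = std::min(values.size(), length);
    std::vector<float> out(values.begin(),
                           values.begin() + static_cast<std::ptrdiff_t>(keep));
    out.resize(length, fill);
    return out;
}
```

With a fixed length of 5, a vector holding 3 values would come back as those 3 values followed by two copies of the fill value, so the same 5 features exist for every entry.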

Cheers,
Kim