TTree::Fill() disturbs TTreeFormula::GetNdata() on vectors

Dear all,

I have the following use case:

  • generate events in the main loop,
  • apply selections and fill histograms: we do this by reading variables and cuts as strings from text files, and evaluating them event by event using TTreeFormula,
  • fill an output tree for events that pass the selection.

I observe that for variables of vector type (any vector), the formula evaluation always returns the first value. For simple doubles there is no problem.

The problem appears specifically after the following sequence:

  • ttf.GetNdata()
  • ttf.EvalInstance()
  • tree.Fill()

My questions and comments are:

  • what exactly is the role of GetNdata()? It returns the variable cardinality, but does it do something else internally? In which cases can it be skipped? At some point I saw that avoiding this call would not allow proper formula evaluation.

  • in the attached example, when I comment out the GetNdata line the problem goes away (and the formulas are still properly evaluated, although this doesn’t seem to be the case in general). When I don’t fill the tree, the problem also goes away (but then I don’t get any output).

Is my code not correct, or if this is a bug, is there a safe workaround?
A simple macro reproducing the problem is attached.

thanks for your help,
Maarten
tree1.C (1.63 KB)

Dear all

Sorry to insist, but I’d really like to know whether this is mishandling on my side or a confirmed bug, and if it is a bug, whether there is a workaround.

thanks
Maarten

Hi Maarten,

TTreeFormula objects are not intended to be used on a TTree while it is being filled (i.e. this has never been tested). A priori, when writing you have access to the original data, and it is more direct and more efficient to use it directly. In which context do you need to use TTreeFormula on the TTree being filled?

[quote]- what exactly is the role of GetNdata()? It returns the variable cardinality, but does it do something else internally? In which cases can it be skipped? At some point I saw that avoiding this call would not allow proper formula evaluation.[/quote]GetNdata is essential for any formula whose cardinality is not always exactly 1 (that is, for arrays or collections, or more exactly whenever GetMultiplicity() is not zero).

[quote]but does it do something else internally?[/quote]Yes, one of the side effects is ‘reading’ the data from the TTree. The entry that is read is the one that has been set by calling LoadTree (or GetEntry explicitly); it defaults to -1.

Also, you cannot read an entry until it has been filled. So you can fix your simplified example with:

[code]// fill
t1.Fill();

// evaluate formulas
t1.LoadTree(t1.GetEntries()-1);

if( getndata )
   ttfVector.GetNdata();

cout << "Read entry is " << (long)t1.GetReadEntry() << endl;
cout << "     ttfFloat  : py = " << py << " / " << ttfFloat.EvalInstance()
     << "     ttfVector : py = " << vecp[1] << " / " << ttfVector.EvalInstance() << endl;[/code]

Cheers,
Philippe.

Hi Philippe

thank you very much for looking into this. This is what we do (and how the problem comes about):

1- loop over input data
2- apply a set of cuts, for object selection (electrons, muons, …)
3- then compute output variables (e.g Mee, …), which are branches of the output tree
4- then apply cuts on these
5- fill the output tree only if all cuts are passed.

Cuts are defined from text files, with strings which are read into TTreeFormulas. This is how (at step 4) we evaluate a TTreeFormula on the output tree, before filling it.

[ In addition, we also declare histograms in a similar way (reading text files into TTreeFormulas) which are automatically booked and filled at each cut level ]

This TTreeFormula mechanism is quite powerful, makes programs very configurable and can avoid a lot of coding overhead… but can the above work? Do you have a suggestion on how to organize this so that it works?

A subsidiary question: GetNdata() seems to be the time-consuming part of working with TTreeFormula (EvalInstance() by itself seems to cost nothing compared to working directly with variables). Is there a more efficient way to just test cardinality, to avoid the unnecessary GetNdata() calls?

best,
Maarten

[quote]4) we evaluate a TTreeFormula on the output tree, before filling it.[/quote]TTreeFormula can/should induce calls to GetEntry, and thus using a TTreeFormula on a TTree that is being filled is not supported (as there is too much risk of inadvertently losing the data that is being filled). In particular, it is also semantically difficult to resolve ‘reading’ an entry that has not yet been filled.

[quote]This TTreeFormula mechanism is quite powerful, makes programs very configurable and can avoid a lot of coding overhead… but can the above work? do you have a suggestion on how to organize this so that it works?
[/quote]I see, this is indeed possibly quite useful. One possibility is to use an auxiliary in-memory tree that keeps ‘just’ the last entry to do those calculations. Another possibility would be to use a set of ‘interpreted functions’ to achieve the same type of results.

[quote] Is there a more efficient way to just test cardinality, to avoid the unnecessary GetNdata() calls?[/quote]Yes, this is the purpose of TTreeFormula::GetMultiplicity.
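A minimal sketch of that pattern, assuming only that the formula type offers GetMultiplicity(), GetNdata() and EvalInstance(int) as TTreeFormula does. The helper name `evalAllInstances` is hypothetical, and the template is used so the sketch compiles without ROOT. Whether GetNdata() can safely be skipped for single-valued formulas may depend on how the formula loads its data (the example in this thread uses SetQuickLoad), so treat this as an illustration rather than a guaranteed recipe.

```cpp
#include <vector>

// Hypothetical helper: evaluates every instance of a formula for the
// entry that is currently loaded in its tree.
template <class Formula>
std::vector<double> evalAllInstances(Formula& formula)
{
    std::vector<double> values;
    if (formula.GetMultiplicity() == 0) {
        // Single-valued formula: exactly one instance, no GetNdata() call.
        values.push_back(formula.EvalInstance(0));
        return values;
    }
    // Array or collection: GetNdata() gives the number of instances
    // (and, per the discussion above, also triggers the data reading).
    const int n = formula.GetNdata();
    for (int i = 0; i < n; ++i) {
        values.push_back(formula.EvalInstance(i));
    }
    return values;
}
```

With ROOT available, a call such as `evalAllInstances(ttfVector)` after `LoadTree` would return one value per vector element of the loaded entry.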

Cheers,
Philippe.

PS. tree1.C using an auxiliary in-memory tree as temporary input/output and using GetMultiplicity:[code]
#include "TRandom.h"
#include "TTree.h"
#include "TTreeFormula.h"
#include <iostream>
#include <vector>

using namespace std;

void tree1w( bool getndata )
{
   gRandom->SetSeed( 123456 );

   // create the Tree and a few branches
   TTree t1("t1","a simple Tree with simple variables");

   Float_t px, py, pz;
   t1.Branch("px",&px,"px/F");
   t1.Branch("py",&py,"py/F");
   t1.Branch("pz",&pz,"pz/F");

   vector<float> vecp;
   t1.Branch("vecp",&vecp);

   // auxiliary staging tree: the two trees share their structure
   // and the user objects' addresses
   TTree *staged_tree = t1.CloneTree(0);
   staged_tree->SetDirectory(0);   // in memory only
   staged_tree->SetCircular(1);    // keep 'just' the last entry

   // create tree formulas on the staging tree
   TTreeFormula ttfFloat("ttfFloat","py",staged_tree);    // evaluates a float == py
   ttfFloat.SetQuickLoad(kTRUE);
   TTreeFormula ttfVector("ttfVector","vecp",staged_tree); // evaluates a vector element ==py
   ttfVector.SetQuickLoad(kTRUE);

   // event loop
   for (Int_t i=0;i<10;i++) {

      // reset
      vecp.clear();

      // generate event
      gRandom->Rannor(px,py);
      pz = px*px + py*py;

      vecp.push_back(px);
      vecp.push_back(py);
      vecp.push_back(pz);
      for(int j = i; j < 10; ++j) {
         vecp.push_back(pz);
      }

      // stage the event, then evaluate formulas on the staged entry
      staged_tree->Fill();
      staged_tree->LoadTree(0);

      if( getndata ) {
         // GetNdata is only needed when the multiplicity is not zero
         if (ttfFloat.GetMultiplicity()!=0) cout << ttfFloat.GetNdata() << " ";
         if (ttfVector.GetMultiplicity()!=0) cout << ttfVector.GetNdata() << " ";
      }
      cout << "     ttfFloat  : py = " << py << " / " << ttfFloat.EvalInstance();
      cout << "     ttfVector : py = " << vecp[1] << " / " << ttfVector.EvalInstance();
      cout << endl;

      // fill the output tree
      t1.Fill();
   }
}
[/code]