Branches needed to evaluate a TTreeFormula

I’m using a TTreeFormula in a loop over all entries of a TTree. At the moment I leave all branches enabled to ensure the formula has all its inputs:

myQuasiCode(std::vector<TString> copyvars, std::vector<std::pair<TString,TString>> formulas) {

if (formulas.empty()) {
  tree->SetBranchStatus("*",0);
  for (auto var : copyvars) {
     tree->SetBranchStatus(var,1);
  }
} else {
  /// TODO: do something smarter here
  tree->SetBranchStatus("*",1);
}

std::vector<TTreeFormula*> m_formulas;
std::vector<float> m_formVars(formulas.size(),0.f);
for (size_t i = 0 ; i < formulas.size() ; ++i) {
  // TString has no c_str(); Data() returns the underlying const char*
  m_formulas.emplace_back(new TTreeFormula(formulas[i].first.Data(), formulas[i].second.Data(), tree));
}

// then create a new tree with branches for all the variables which get copied, create branches for the formulas

const Long64_t nEntries = tree->GetEntries();
for (Long64_t entry = 0; entry < nEntries; ++entry) {
  tree->GetEntry(entry);
  // evaluate every formula for the current entry
  for (size_t i = 0; i < m_formulas.size(); ++i) {
    m_formVars[i] = m_formulas[i]->EvalInstance();
  }
  // do all sorts of other stuff
}
} // end of myQuasiCode

So I’m looking for something like the following:

  for (auto form : formulas) {
     for (auto var: form.NeededBranches()) {
        tree->SetBranchStatus(var,1);
     }
  }

(Sure, var could be a branch whose name I then need to get… let’s call this a technical detail.)

The main point is that the formulas come from elsewhere in the repository, or even from code in another subpackage (assuming someone ever wants to reuse my code). The branches could be listed at the place where the formula is defined, but let’s say this is a job a computer should do much better: copying variable names out of a formula without making typos, without forgetting the last variable, and without forgetting that when changing “piplus_P” in the formula to “piminus_P” one also needs to update the list of branches. Bottom line, I want the branches to switch on/off rather automatically for maintenance reasons.

Hi

[quote]So I’m looking for something like the following:[/quote]The good news is that this is superfluous.

[quote]bottom line, I want the branches to switch on/off rather automatically for maintenance reasons.[/quote]This is already the case for TTreeFormula. TTreeFormula will explicitly read only the branches that are involved in the formula.

Cheers,
Philippe.

Hi,

It does not seem to work the way I expect (this is at the interactive prompt rather than in compiled code, which is what I really want, but I hope it demonstrates the point):

[ins] pseyfert@robusta /tmp > root -l DTT_Bs2DsMuNu_66.root                                            19:23:50
-------------------------
Set LHCb Style - Feb 2012
-------------------------
root [0] 
Attaching file DTT_Bs2DsMuNu_66.root as _file0...
(TFile *) 0x2e6b6d0
root [1] gDirectory->cd("Bs2DsMuNuTuple")
(Bool_t) true
root [2] TTree* t = DecayTree
(TTree *) 0x31034b0
root [3] double bbb
(double) 0.00000
root [4] t->Print("Bs_MCO*")
******************************************************************************
*Tree    :DecayTree : DecayTree                                              *
*Entries :     7416 : Total =        64337281 bytes  File  Size =   35310634 *
*        :          : Tree compression factor =   1.81                       *
******************************************************************************
*Br    0 :Bs_MCORR  : Bs_MCORR/D                                             *
*Entries :     7416 : Total  Size=      59982 bytes  File Size  =      50622 *
*Baskets :        2 : Basket Size=      32000 bytes  Compression=   1.18     *
*............................................................................*
*Br    1 :Bs_MCORRERR : Bs_MCORRERR/D                                        *
*Entries :     7416 : Total  Size=      60000 bytes  File Size  =      54228 *
*Baskets :        2 : Basket Size=      32000 bytes  Compression=   1.10     *
*............................................................................*
*Br    2 :Bs_MCORRFULLERR : Bs_MCORRFULLERR/D                                *
*Entries :     7416 : Total  Size=      60024 bytes  File Size  =      53620 *
*Baskets :        2 : Basket Size=      32000 bytes  Compression=   1.11     *
*............................................................................*
root [5] t->SetBranchAddress("Bs_MCORR",&bbb)
(Int_t) 0
root [6] TTreeFormula* formula = new TTreeFormula("foobar","Bs_MCORR+10*Bs_MCORRERR",t)
(TTreeFormula *) 0x22dde60
root [7] t->SetBranchStatus("*",0)
root [8] t->SetBranchStatus("Bs_MCORR",1)
root [9] t->GetEntry(3)
(Int_t) 8
root [10] bbb
(double) 4561.71
root [11] formula->EvalInstance();
root [12] formula->EvalInstance()
(Double_t) 4561.71
root [13] t->SetBranchStatus("*",1)
root [14] t->GetEntry(3)
(Int_t) 8364
root [15] formula->EvalInstance()
(Double_t) 6502.77
root [16] 

Though I do see an error message at runtime when changing the order:

[ins] pseyfert@robusta /tmp > root -l DTT_Bs2DsMuNu_66.root                                            19:25:57
-------------------------
Set LHCb Style - Feb 2012
-------------------------
root [0] 
Attaching file DTT_Bs2DsMuNu_66.root as _file0...
(TFile *) 0x23eaa80
root [1] gDirectory->cd("Bs2DsMuNuTuple")
(Bool_t) true
root [2] TTree* t = DecayTree
(TTree *) 0x26243f0
root [3] double bbb
(double) 0.00000
root [4] t->Print("Bs_MCO*")
******************************************************************************
*Tree    :DecayTree : DecayTree                                              *
*Entries :     7416 : Total =        64337281 bytes  File  Size =   35310634 *
*        :          : Tree compression factor =   1.81                       *
******************************************************************************
*Br    0 :Bs_MCORR  : Bs_MCORR/D                                             *
*Entries :     7416 : Total  Size=      59982 bytes  File Size  =      50622 *
*Baskets :        2 : Basket Size=      32000 bytes  Compression=   1.18     *
*............................................................................*
*Br    1 :Bs_MCORRERR : Bs_MCORRERR/D                                        *
*Entries :     7416 : Total  Size=      60000 bytes  File Size  =      54228 *
*Baskets :        2 : Basket Size=      32000 bytes  Compression=   1.10     *
*............................................................................*
*Br    2 :Bs_MCORRFULLERR : Bs_MCORRFULLERR/D                                *
*Entries :     7416 : Total  Size=      60024 bytes  File Size  =      53620 *
*Baskets :        2 : Basket Size=      32000 bytes  Compression=   1.11     *
*............................................................................*
root [5] t->SetBranchAddress("Bs_MCORR",&bbb)
(Int_t) 0
root [6] t->SetBranchStatus("*",0)
root [7] t->SetBranchStatus("Bs_MCORR",1)
root [8] TTreeFormula* formula = new TTreeFormula("foobar","Bs_MCORR+10*Bs_MCORRERR",t)
Error in <TTreeFormula::DefinedVariable>: the branch "Bs_MCORRERR" has to be enabled to be used
Error in <TTreeFormula::Compile>:  Part of the Variable "Bs_MCORRERR" exists but some of it is not accessible or useable
(TTreeFormula *) 0x1818700
root [9] t->GetEntry(3)
(Int_t) 8
root [10] bbb
(double) 4561.71
root [11] formula->EvalInstance()
(Double_t) 4561.71
root [12] 

Hi,

Cheers,
Philippe.

Hi,

We’re talking about trees with more than 1M entries and somewhere between 1000 and 5000 branches, of which the TTreeFormulas will need, say, fewer than 15 and the remaining code maybe 30. So I assume disabling the unneeded branches makes sense for performance reasons. And the number of needed branches is, I’d say, at a level where bugs will be introduced if it is handled by hand, while a computer should be able to pick the branches very reliably (as we’ve just seen from the error message, TTreeFormula is already able to detect that a necessary branch is disabled).

Cheers,
Paul

Hi,

[quote]So I assume disabling the unneeded branches makes sense for performance reasons.[/quote]One genuine question is: the performance of what?

Really, if all you use is TTreeFormula, you do not need to disable the branches and will still get maximum performance. TTreeFormula will only load/read the branches it needs and nothing else.

[quote]and the remaining code maybe 30.[/quote]For those cases, I still prefer the direct approach, i.e. just call GetEntry on those 30 branches. If instead you disable branches and use TTree::GetEntry, one downside is that the current code will, for each TTree::GetEntry call, loop through all the branches to ask whether they are enabled or not.
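A minimal sketch of that direct approach, assuming the needed branches have already been collected into a vector. The helper name readBranches is made up for illustration, and Branch is a template parameter only so that the snippet compiles without ROOT headers; in a ROOT program it would be TBranch, whose GetEntry returns the number of bytes read (or a negative value on error).

```cpp
#include <vector>

// Hypothetical helper: read one entry from each listed branch only,
// instead of calling TTree::GetEntry, which visits every branch.
// Returns the total number of bytes read, or the first negative
// return value reported by a branch.
template <class Branch>
long long readBranches(const std::vector<Branch*>& branches, long long entry) {
  long long total = 0;
  for (Branch* branch : branches) {
    if (!branch) continue;                       // skip missing branches
    const long long n = branch->GetEntry(entry); // bytes read for this branch
    if (n < 0) return n;                         // propagate an I/O error
    total += n;
  }
  return total;
}
```

In the event loop this would replace tree->GetEntry(entry). If I remember correctly, TTreeFormula takes the current entry from the tree, so tree->LoadTree(entry) should still be called before EvalInstance(); please check that against your ROOT version.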

Cheers,
Philippe.

PS. The information you are asking about is (of course) available via TTreeFormula::GetLeaf, which takes an integer index from 0 up to (but not including) TTreeFormula::GetNcodes().
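As a rough sketch, the loop over GetNcodes and GetLeaf could look like the following. The function name neededBranchNames is hypothetical, and the formula type is a template parameter purely so that the snippet compiles without ROOT; with ROOT one would pass a TTreeFormula, whose leaves are TLeaf objects providing GetBranch().

```cpp
#include <set>
#include <string>

// Hypothetical helper: collect the names of all branches a formula uses.
// Works with any type offering GetNcodes() and GetLeaf(int), as
// TTreeFormula does.
template <class Formula>
std::set<std::string> neededBranchNames(Formula& formula) {
  std::set<std::string> names;
  const int nCodes = formula.GetNcodes();
  for (int i = 0; i < nCodes; ++i) {
    auto* leaf = formula.GetLeaf(i);
    if (!leaf) continue;               // defensive: a code may have no leaf
    auto* branch = leaf->GetBranch();  // the branch that owns this leaf
    if (branch) names.insert(branch->GetName());
  }
  return names;
}
```

Each returned name can then be passed to tree->SetBranchStatus(name.c_str(), 1). Judging from the two sessions above, the formula has to be constructed while its branches are still enabled, so the disabling and re-enabling would come after the TTreeFormula constructor.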

Hi,

Indeed, I was unspecific. My main performance interest is the time the program takes, and my guess is that the limit is disk I/O. I always had the impression (never measured it, though) that looping over TTrees is faster when unnecessary branches are disabled.

But that aside, I think filling a vector with all the branches I need, in order to call GetEntry on those branches, is doable as well. At that point I still need to know which branches are needed by the TTreeFormula, which GetLeaf (which you just pointed to) will provide.
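Filling that vector from several formulas could be sketched like this. The name appendNeededBranches is made up, and both types are template parameters only so that the snippet compiles without ROOT; in practice they would be TTreeFormula and TBranch.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: append every branch used by `formula` to `out`,
// skipping branches already in the list so that formulas sharing an
// input do not cause the same branch to be read twice.
template <class Branch, class Formula>
void appendNeededBranches(Formula& formula, std::vector<Branch*>& out) {
  const int nCodes = formula.GetNcodes();
  for (int i = 0; i < nCodes; ++i) {
    auto* leaf = formula.GetLeaf(i);
    if (!leaf) continue;                 // defensive: a code may have no leaf
    Branch* branch = leaf->GetBranch();
    if (branch && std::find(out.begin(), out.end(), branch) == out.end()) {
      out.push_back(branch);
    }
  }
}
```

Calling it once per formula gives one deduplicated list, to which the branches of the copied variables (for example from tree->GetBranch(name)) can be added in the same way.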

Thanks,
Paul