TTree split level and custom classes for SetBranchAddress

ROOT Version: 6.32

Hi all,

I am a bit confused about how split branches work with SetBranchAddress when using custom classes, and I struggle to find answers.

Let’s say I have the following two object definitions:

struct Evt : public TObject
{
  int id;
  Hit mainhit;               // Hit must be defined before Evt
  std::vector<double> w;
  std::vector<Hit> hits;
  ClassDef(Evt, 1)           // ClassDef takes the class name and a version number
};

and

struct Hit : public TObject
{
  int id;
  double t;
  std::vector<double> pos;
  ClassDef(Hit, 1)           // ClassDef takes the class name and a version number
};

Let’s say I create a TTree with a branch of type Evt, which I also name “Evt”. Later, I read that TTree by doing:

TTree* EvtTree = (TTree*) somefile->Get("treename");
Evt* evt = nullptr;
EvtTree->SetBranchAddress("Evt", &evt);

But if I am trying to take advantage of split branches, what else could I set the branch address to? Evt.id? Evt.mainhit? Evt.mainhit.id? Evt.w? Evt.w[0]? Evt.hits[0].t? Evt.hits[0].pos?

If some of those would require some light rewrite of the structures to get it to work, I would also like to know!

Thanks a lot for your help,

Francisco

Hi Francisco,

Here you can find the doc you are looking for: Trees - ROOT

I would also suggest getting rid of the TObject inheritance in your data model unless you really need it; from the code you shared, it does not bring any advantage.

Cheers,
D

Check out this related thread too:

You can face issues with alignment/padding, which in your example are further complicated by variable-size vectors, so it’s safer/easier to split the variables into individual branches.
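
To illustrate the padding point: assuming the concern is a plain struct whose members are mapped onto one branch as a packed list of values, the compiler may insert padding between members of different sizes, so the in-memory layout need not match a tightly packed record. A minimal sketch in plain C++ (no ROOT needed; Mixed, kPackedSize and kPaddingBeforeT are hypothetical names used only for illustration):

```cpp
#include <cstddef>   // offsetof, std::size_t

// Hypothetical struct mixing a 4-byte and an 8-byte member.
struct Mixed {
  int id;
  double t;
};

// Size a tightly packed (id, t) record would have, with no padding at all.
constexpr std::size_t kPackedSize = sizeof(int) + sizeof(double);

// Bytes the compiler inserted between `id` and `t`.
// The exact value is platform dependent.
constexpr std::size_t kPaddingBeforeT = offsetof(Mixed, t) - sizeof(int);
```

If sizeof(Mixed) is larger than kPackedSize on your platform, a reader that assumes a packed layout would pick up `t` from the wrong bytes, which is one reason separate branches are the simpler choice.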

The easiest is to access them indirectly by using RDataFrame.

Alternatively, you can create a skeleton that gives you access to the content of the branch by calling TTree::MakeSelector. Another alternative, if you can load the library containing the definitions of Evt and Hit, is to set the address of the top-level branch but only partially load its content by using SetBranchStatus or, better yet, by calling LoadTree and then GetEntry on only the branches you need.

Thank you all for your quick replies!

Just to clarify, I do have access to the library where Evt and Hit are defined. I can SetBranchAddress to the Evt branch, and then access its members accordingly. In practice, there are many other member variables. What I was trying to do is create a function that takes the name of a member variable and returns its value (as preparation for a fit, because multiple member variables are potential measures of the “energy” of the event). While I could hard-code the different options, being able to input the sub-branch as a string to be read would make it easier to maintain.

@Danilo : the code I shared is a snippet of a more complex structure from a library my collaboration uses. I do not know why they inherit from TObject, but I will keep in mind for my own structures that it’s not necessary. The documentation you pointed to implies that I should be able to SetBranchAddress directly to a vector and to its individual elements (point 3 of “Creating branches”), but I get “Error in TChain::SetBranchAddress: unknown branch” when I try to do so. Are there any likely reasons for this?

@dastudillo : what do you mean by safer/easier? If I understand the link you pointed to, this only works if all of the member variables are of the same kind, or at most the last one is of a different, “smaller” type? Is your recommendation that I get these member variables outside of the struct instead, as their own branches, if I want to read them separately?

@pcanal : I will look into RDataFrame, thanks! I think based on the additional information I gave, the rest of your reply is not applicable?
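
For comparison, the hard-coded variant mentioned above can at least be kept in one place with a name-to-accessor table. This is only a sketch in plain C++ and does not discover members automatically; the simplified Evt and Hit below are stand-ins for the real classes, and Getter, Getters and GetValue are hypothetical names:

```cpp
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Simplified stand-ins for the real classes (no ROOT needed for the sketch).
struct Hit { int id = 0; double t = 0.0; };
struct Evt {
  int id = 0;
  Hit mainhit;
  std::vector<double> w;
};

using Getter = std::function<double(const Evt&)>;

// One table listing every member that may serve as the fit variable.
inline const std::map<std::string, Getter>& Getters() {
  static const std::map<std::string, Getter> table = {
    {"id",        [](const Evt& e) { return static_cast<double>(e.id); }},
    {"mainhit.t", [](const Evt& e) { return e.mainhit.t; }},
    {"w[0]",      [](const Evt& e) { return e.w.at(0); }},
  };
  return table;
}

// Look up a member by name; throws std::out_of_range for unknown names.
inline double GetValue(const Evt& e, const std::string& name) {
  return Getters().at(name)(e);
}
```

Adding a new candidate variable then means adding one line to the table; the string-driven approaches suggested in this thread avoid even that.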

Indeed, I think that to avoid complications it’s better to store and read them as separate branches, not “grouped” in a structure or object, unless you really need that. Individual arrays or vectors can be branches like any ‘scalar’ variable, no problem. The link I shared, like your issue, is about “grouping” more than one variable/array/vector with different types/sizes in a struct, which needs some extra care, as discussed in that post. Otherwise, you can try the other suggestions and read them with help from MakeSelector or with RDataFrame. (I haven’t used RDataFrame much, so I don’t know whether reading vectors, possibly of different sizes, ‘just works’ or needs some care, but searching the forum may turn up examples.)

but I get “Error in TChain::SetBranchAddress: unknown branch”

A priori the branch name is misspelled.

What I was trying to do is create a function that takes the name of a member variable and returns its value

For TTree, you can use TTreeFormula for this purpose. It is created using the ‘full path’ to the member name, returns a double value, and can be iterated over if the member is a collection or is contained in a collection.

RDataFrame can also be fed/constructed via string information.

Hm, so this made me double-check my process. Here is a short version running from the interpreter:

$ root -l /path/to/my/file
[] gSystem->Load("/path/to/my/lib/lib.so");
[] Evt* evt = 0;
[] Tree->SetBranchAddress("Evt", &evt)
(int) 0
[] vector<double>* w = 0;
[] Tree->SetBranchAddress("Evt.w", &w); //no complaints with either "Evt.w" or "w"
(int) 0
[] Tree->GetEntry(0)
(int) 1569
[] evt->w.size()
(unsigned long) 23
[] w->size()
Error in <HandleInterpreterException>: Trying to dereference null pointer or trying to call routine taking non-null arguments
Execution of your code was aborted.
ROOT_prompt_28:1:1: warning: null passed to a callee that requires a non-null argument [-Wnonnull]
w->size()
^~~~~~

[] vector<double>* w_test = 0;
[] Tree->SetBranchAddress("Evt->w", &w_test);
Error in <TTree::SetBranchAddress>: unknown branch -> Evt->w
(int) -5

[] double single_weight;
[] Tree->SetBranchAddress("Evt.w[0]", &single_weight)
Error in <TTree::SetBranchAddress>: unknown branch -> Evt.w[0]
(int) -5

Any errors in the approach here? Or is this something that just has to be done with TTreeFormula?

For whatever the additional context is worth, doing Tree->Print() shows all sub-members recursively, e.g.:

[...]
*............................................................................*
*Br   49 :trks.dir.x : Double_t x[trks_]                                     *
*Entries :   149448 : Total  Size=    2997111 bytes  File Size  =    2127140 *
*Baskets :       76 : Basket Size=    5000000 bytes  Compression=   1.41     *
*............................................................................*
[...]

where

  • Evt is the top level branch, and does not appear in the subbranch name,
  • trks is a vector<Trk> member of Evt (Trk is a custom struct),
  • dir is a Vec member of each Trk (Vec is also a custom struct),
  • x is a double member of dir,

and doing cuts such as Evt->trks[0].dir.x>0 on the TTree “just works”™.

Those are accurate messages. There is no branch by those names. The only valid branch name is Evt.w.

[] w->size()
Error in <HandleInterpreterException>: Trying to dereference null pointer or trying to call routine taking non-null arguments

Yes, setting the address of a nested object is delicate. The following may work:

[] Evt* evt = 0;
[] Tree->SetBranchAddress("Evt", &evt)
// Get a first entry to set things up
[] Tree->GetEntry(0)
(int) 0
[] vector<double>* w = new std::vector<double>;
// Override the address partially
[] Tree->SetBranchAddress("Evt.w", w); //no complaints with either "Evt.w" or "w"
[] Tree->GetEntry(0)
[] if (w && w->size()) std::cout << "Success we read " << w->size() << " vector values\n";

Or is this something that just has to be done with TTreeFormula?

It can be done with SetBranchAddress; it is just ‘hard’.

and doing cuts such as Evt->trks[0].dir.x>0 on the TTree “just works”™.

Yep, TTreeFormula does all the hard work there :)

Another option for you is to use another set of classes: the TTree Readers. The result of MakeSelector gives you examples on how to set it up.

Sadly this did not work. w is empty after the last step, while evt->w isn’t.

I also did a quick check with MakeSelector, but I don’t understand how to use it. This is what I get:

root [12] Tree->MakeSelector("Evt")
Error in <AnalyzeBranch>: Arrays inside collections are not supported yet (branch: trks.any).
Error in <AnalyzeBranch>: Arrays inside collections are not supported yet (branch: mc_trks.any).
(int) 0
root [15] Tree->MakeSelector("trks.dir")
Error in <AnalyzeBranch>: Arrays inside collections are not supported yet (branch: trks.any).
Error in <AnalyzeBranch>: Arrays inside collections are not supported yet (branch: mc_trks.any).
(int) 0
root [13] Tree->MakeSelector("@trks.dir")
Error in <AnalyzeBranch>: Arrays inside collections are not supported yet (branch: trks.any).
Error in <AnalyzeBranch>: Arrays inside collections are not supported yet (branch: mc_trks.any).
(int) 0
root [16] Tree->MakeSelector("trks.dir.x")
Error in <AnalyzeBranch>: Arrays inside collections are not supported yet (branch: trks.any).
Error in <AnalyzeBranch>: Arrays inside collections are not supported yet (branch: mc_trks.any).
(int) 0

All of the resulting files are empty.

I finally tried with TTreeFormula, since it seemed the simplest. This is what I got to work:

$ root -l /path/to/my/file.root
[] gSystem->Load("/path/to/my/lib/lib.so");
[] Evt* evt = 0;
[] Tree->SetBranchAddress("Evt", &evt);
(int) 0
[] TTreeFormula formula("formula", "Evt->trks[0].dir.x", Tree);
[] Tree->GetEntry(0)
(int) 1569
[] evt->trks[0].dir.x
(double) -0.71326228
[] formula.EvalInstance()
(double) -0.71326228

So, for anyone coming after me, you can see that the formula approach gets you the same value as the “manual” SetBranchAddress approach. It seems that both “->” and “.” work in the formula. However, this does not seem to work if the formula points to a vector or some other complex type. That’s not a problem for me at the moment, but might be for others.

Thanks a lot everyone for your quick help :)

Sadly this did not work. w is empty after the last step, while evt->w isn’t.

:( but not surprising …

The parameter of MakeSelector is the stem of a filename. After this step you should have a new file, Evt.h, that contains the skeleton. Tree->MakeSelector("myselector") is a more typical usage :)

Error in <AnalyzeBranch>: Arrays inside collections are not supported yet (branch: trks.any).
Error in <AnalyzeBranch>: Arrays inside collections are not supported yet (branch: mc_trks.any).

These errors mean that the TTreeReader will not be able to provide access to these two branches but will be able to provide access to all the other branches. See the content of Evt.h (or myselector.h) for more details.

It should work. When the formula refers to a collection of things, nested or not, the collection is semantically linearized (for example, a 2D array, like any nested collection, is presented as a single array). The number of elements in the linear collection is returned by TTreeFormula::GetNdata, and you access an individual element by passing its index to EvalInstance. I.e. you should get:

[] TTreeFormula formula("formula", "Evt->trks[].dir.x", Tree);
[] Tree->GetEntry(0)
(int) 1569
[] evt->trks[0].dir.x
(double) -0.71326228
[] formula.GetNdata()
number of elements in `trks` (times the number of elements in `dir` and `x`, or 1 if they are not collections)
[] formula.EvalInstance(0)
(double) -0.71326228
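
The linearization described above can be pictured with plain C++ containers. This is only a sketch of the semantics, not ROOT's implementation: Vec and Trk are simplified stand-ins, the `pos` member of Trk and the helpers FlattenDirX and FlattenPos are hypothetical, and the outer-index-first ordering of the nested case is an assumption.

```cpp
#include <vector>

// Simplified stand-ins for the classes discussed in this thread.
struct Vec { double x = 0, y = 0, z = 0; };
struct Trk {
  Vec dir;
  std::vector<double> pos;   // hypothetical nested collection
};

// Analogue of "trks[].dir.x": one value per track,
// so the size of the result plays the role of GetNdata().
inline std::vector<double> FlattenDirX(const std::vector<Trk>& trks) {
  std::vector<double> out;
  out.reserve(trks.size());
  for (const Trk& t : trks) out.push_back(t.dir.x);
  return out;
}

// Analogue of a nested collection: all inner vectors are concatenated
// into a single array (assumed order: outer index first).
inline std::vector<double> FlattenPos(const std::vector<Trk>& trks) {
  std::vector<double> out;
  for (const Trk& t : trks) out.insert(out.end(), t.pos.begin(), t.pos.end());
  return out;
}
```

In this picture, EvalInstance(i) corresponds to reading element i of the flattened vector.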

You are right. I am not sure what I was doing before, but when I do it again now, I do get the *.h and *.C files to give me the TTreeReader skeleton code. It looks interesting, but clearly I would need to read up on this some more before using it.

Thank you for the additional details on proper use. There might be a bug with EvalInstance()? Here is what I see:

$ root -l /path/to/my/file.root
[] gSystem->Load("/path/to/my/lib/lib.so");
[] Evt* evt = 0;
[] Tree->SetBranchAddress("Evt", &evt);
(int) 0
[] TTreeFormula formula("formula", "Evt->trks[].dir.x", Tree);
[] Tree->GetEntry(0)
(int) 2425

[] evt->trks.size()
(unsigned long) 2
[] evt->trks[0].dir.x
(double) 0.69604483
[] evt->trks[1].dir.x
(double) 0.56585223

[] formula.EvalInstance(0)
(double) 0.69604483
[] formula.EvalInstance(1)
(double) 0.0000000
[] formula.GetNdata()
(int) 2
[] formula.EvalInstance(1)
(double) 0.56585223

Before I call GetNdata(), only EvalInstance(0) “works”. After I call it, things behave as expected. It’s as if TTreeFormula was only “aware” it is pointing to a collection after I call GetNdata() for the first time.

Check out line 4035 of this PR:

So, just to be super clear: this is known behaviour for a TTreeFormula that points to objects with dynamic size, and the fix is indeed to call GetNdata() before EvalInstance()? There is a scary “But the following fails” after line 4035 of that PR, and I can’t see what the issue is there.

Yes, you summarized well.
That’s a typo in the PR, I will fix it, thanks for noticing!
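
Given the behaviour confirmed above, one way to avoid forgetting the call is to wrap the evaluation in a small helper that always calls GetNdata() first. A minimal sketch: the helper is a template, so it should accept a TTreeFormula (only GetNdata() and EvalInstance(i), the two calls used in this thread, are needed) or any stand-in with the same two methods. EvalAll and FakeFormula are hypothetical names, and FakeFormula merely mimics the behaviour reported above so that the sketch runs without ROOT.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Evaluate every instance of a formula for the current entry.
// GetNdata() is deliberately called before any EvalInstance().
template <class Formula>
std::vector<double> EvalAll(Formula& formula) {
  const int n = formula.GetNdata();
  std::vector<double> values;
  if (n > 0) values.reserve(static_cast<std::size_t>(n));
  for (int i = 0; i < n; ++i) values.push_back(formula.EvalInstance(i));
  return values;
}

// Hypothetical stand-in mimicking the observation in this thread:
// until GetNdata() has been called, only instance 0 is returned correctly.
class FakeFormula {
 public:
  explicit FakeFormula(std::vector<double> data) : data_(std::move(data)) {}
  int GetNdata() {
    ready_ = true;
    return static_cast<int>(data_.size());
  }
  double EvalInstance(int i = 0) {
    if (i < 0 || i >= static_cast<int>(data_.size())) return 0.0;
    if (!ready_ && i > 0) return 0.0;
    return data_[static_cast<std::size_t>(i)];
  }
 private:
  std::vector<double> data_;
  bool ready_ = false;
};
```

With ROOT, the intended usage would be to call EvalAll(formula) right after Tree->GetEntry(i), once per entry.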

Ok, combining everything into a single post to flag it as the solution.

If I have a TTree with custom structs such that:

  • Event is an Evt and the top-level branch (Evt is a custom struct),
  • trks is a vector<Trk> member of Event (Trk is also a custom struct),
  • dir is a Vec member of each Trk (Vec is also a custom struct),
  • x is a double member of dir,

then I can access the sub-branches with TTreeFormula as follows (showing also with SetBranchAddress for comparison):

[] gSystem->Load("/path/to/my/lib/lib.so");
[] TFile f("/path/to/my/file.root");
[] TTree* Tree = (TTree*) f.Get("Tree");

[] Evt* evt = 0;
[] Tree->SetBranchAddress("Event", &evt);
(int) 0
[] TTreeFormula formula1("formula1", "Event->trks[0].dir.x", Tree);
[] TTreeFormula formula2("formula2", "Event->trks[].dir.x", Tree);
[] Tree->GetEntry(0)
(int) 1569

[] evt->trks[0].dir.x
(double) 0.69604483
[] formula1.EvalInstance()
(double) 0.69604483

[] evt->trks.size()
(unsigned long) 2
[] evt->trks[0].dir.x
(double) 0.69604483
[] evt->trks[1].dir.x
(double) 0.56585223
[] formula2.GetNdata() // you have to call this first if pointing to dynamic size object
(int) 2
[] formula2.EvalInstance(0)
(double) 0.69604483
[] formula2.EvalInstance(1)
(double) 0.56585223

and I am pretty sure the -> could have been a . in the TTreeFormula.


Ok, so I can’t edit the solution anymore, but I have to amend it: you need to call GetNdata() if the “path” in TTreeFormula includes any dynamic-sized object, even if you fix the indices in the formula. It only works as-is if all the indices are zero.

That is, TTreeFormula formula("formula", "Event->trks[0].dir.x", Tree) works out of the box, but
TTreeFormula formula("formula", "Event->trks[1].dir.x", Tree) will return zero when evaluated (or N/A once the linked PR gets merged, I assume). You have to call GetNdata() first, even though the formula will always have an Ndata == 1. Does this qualify as a bug?