Subclasses in TBranch

gfronze · June 2, 2017, 3:10pm

Dear fellow ROOTers,

I would like some clarification on the usability of subclasses in the TBranches of a TTree.
I have a class hierarchy like the following:

Mother
- Daughter1
- Daughter2
- Daughter3

I want to use a TBranch with its address set to a Mother* pointer, because I need to fill the TBranch with heterogeneous objects of types Daughter1, Daughter2 and Daughter3. Mother contains an enumerator that identifies the subclass the object belongs to (this allows a recast of the retrieved value).
I am trying to read back the values I have stored in the tree, but apparently they have been saved as Mother objects, losing the additional information carried by the Daughter classes.
How should I handle this situation? Why isn’t the whole data saved to the TBranch?
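
For illustration, the tag-based recast described in the question might look like the minimal sketch below. It is plain C++ with no ROOT involved, and every member and function name in it (kind, runNumber, voltage, note, asDaughter1, describe) is a hypothetical stand-in, not taken from the original classes.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-ins for the Mother/Daughter classes in the question.
struct Mother {
    enum class Kind { Daughter1, Daughter2, Daughter3 };
    explicit Mother(Kind k) : kind(k) {}
    virtual ~Mother() = default;
    Kind kind;  // tag recording the concrete subclass
};

struct Daughter1 : Mother {
    explicit Daughter1(int run) : Mother(Kind::Daughter1), runNumber(run) {}
    int runNumber;
};

struct Daughter2 : Mother {
    explicit Daughter2(double v) : Mother(Kind::Daughter2), voltage(v) {}
    double voltage;
};

struct Daughter3 : Mother {
    explicit Daughter3(std::string n)
        : Mother(Kind::Daughter3), note(std::move(n)) {}
    std::string note;
};

// Checked downcast: the tag must match before the cast is performed.
const Daughter1& asDaughter1(const Mother& m) {
    assert(m.kind == Mother::Kind::Daughter1);
    return static_cast<const Daughter1&>(m);
}

// Use the tag to pick the downcast and read a subclass-only member.
std::string describe(const Mother& m) {
    switch (m.kind) {
        case Mother::Kind::Daughter1:
            return "run " + std::to_string(asDaughter1(m).runNumber);
        case Mother::Kind::Daughter2:
            return "voltage " +
                   std::to_string(static_cast<const Daughter2&>(m).voltage);
        case Mother::Kind::Daughter3:
            return "note " + static_cast<const Daughter3&>(m).note;
    }
    return "unknown";
}
```

The recast only recovers the subclass members if the full derived object was actually stored and read back; if only the Mother part reaches the file, the tag still says Daughter1 but there is nothing behind it to cast to.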

Thank you in advance for your help!

mschille · June 2, 2017, 10:10pm

Hi,

I’m writing this as a non-expert in ROOT I/O, and others may have more accurate information, so take my musings with a grain of salt…

I don’t think you can just write polymorphic objects to a branch in a TTree, because if you were to write a “class, or any of its possible subclasses” from memory to disk, you’d run into the problem that they would most likely have different data members (and thus require different amounts of space on disk; worse, a given chunk of memory likely refers to completely different data members in different subclasses that need to be saved differently!). Therefore, the simple solution of just writing to a single branch doesn’t quite work, I think, because ROOT’s I/O subsystem, while smart, isn’t quite that smart. (I’d guess this is by design: the CPU and I/O overhead of allowing polymorphic tree branches is likely so high that the ROOT developers thought the feature not worth implementing, as it would slow the common use cases down to an unreasonable degree.)

That said, I think there are a couple of possible workarounds that I’d imagine would work (although I haven’t tried yet):

- you try to do without inheritance in your on-disk representation, thus sidestepping the entire issue (likely a wise idea anyway; this would be my preferred solution)
- you could have a separate branch for each subclass (sometimes filled, sometimes not), and another branch that’s just a pointer-to-base-class that is set accordingly (and points to the right derived object in each event)
- you could try to sort of “roll your own” inheritance by defining some kind of class that has a tag to indicate the subclass (e.g. an enum) and enough storage for each of the data members in the derived classes to be saved (usually a union of structs, although, again, I’m unsure how well ROOT’s I/O will deal with unions); unions are a dirty trick left over from the bad old C days and allow you to hold any one of the contained objects while only allocating space for the largest contained object (i.e. it’s a way to represent one of several alternatives compactly)

The last item is clearly the most involved, least portable, most troublesome for the compiler, and the one most likely to cause trouble for ROOT’s I/O subsystem. Therefore, my advice would basically boil down to: revise your class/inheritance structure such that each branch holds only one specific type (i.e. a class, not its subclasses, nor its superclass). Then, ROOT’s I/O subsystem is happy, and you’ll also be happy because things will work as expected.
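
To make the “roll your own” item concrete, here is a minimal sketch of a tagged union in plain C++. The payload types and all names (RunInfo, HvReading, Record, makeRun, makeHv, runNumberOf) are hypothetical, and the sketch says nothing about whether ROOT can stream such a type, which the post itself flags as uncertain.

```cpp
#include <cassert>

// Hypothetical payloads, one per former subclass.
struct RunInfo   { int runNumber; int beamType; };
struct HvReading { double voltage; double current; };

// One flat record type: a tag plus storage large enough for any payload.
struct Record {
    enum class Kind { Run, Hv };
    Kind kind;
    double timestamp;
    union {  // only the member selected by `kind` is meaningful
        RunInfo run;
        HvReading hv;
    };
};

Record makeRun(double t, int runNumber, int beamType) {
    Record r;
    r.kind = Record::Kind::Run;
    r.timestamp = t;
    r.run = RunInfo{runNumber, beamType};
    return r;
}

Record makeHv(double t, double voltage, double current) {
    Record r;
    r.kind = Record::Kind::Hv;
    r.timestamp = t;
    r.hv = HvReading{voltage, current};
    return r;
}

// Reads are guarded by the tag so the inactive member is never touched.
int runNumberOf(const Record& r) {
    assert(r.kind == Record::Kind::Run);
    return r.run.runNumber;
}
```

Every Record has the same size regardless of which payload it holds, which is what makes the layout fixed; the price is that correctness depends entirely on checking the tag before each access.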

I hope these ideas will help (at least a bit… ;)).

Cheers,

Manuel

pcanal · June 3, 2017, 12:41pm

Hi,

ROOT I/O supports inheritance perfectly well except in one case: when the top-level branch is both split and polymorphic. When the branch is split, ROOT creates a fixed set of sub-branches that match only what it knows (the base class). To support polymorphism for a top-level object you have two choices. One is to disable splitting for that branch.

tree->Branch(branchname, &ptr, 32000, 0 /* disable splitting */);

The other is to use std::vector<Base*> and request the special split level greater than 100 (see ‘case E’ in https://root.cern.ch/doc/master/classTTree.html). For example, if the collection contains objects of any types deriving from TTrack, TTree::Branch will sort the objects based on their type and store them in separate branches in split mode, i.e. each derived class will have its own set of sub-branches.

tree->Branch(branchname, &ptr2vector, 32000, 199 /* enable splitting of collection of pointers */);
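
To picture what “sort the objects based on their type” means for a collection of base pointers, here is a conceptual sketch in plain C++. It is not ROOT’s implementation; it only shows grouping by dynamic type, and the class and function names (Base, TrackA, TrackB, groupByType) are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <typeindex>
#include <typeinfo>
#include <vector>

struct Base { virtual ~Base() = default; };
struct TrackA : Base { int hits = 0; };
struct TrackB : Base { double charge = 0.0; };

// Group a heterogeneous collection by the dynamic type of each element,
// so that every concrete class ends up with its own homogeneous list.
std::map<std::type_index, std::vector<const Base*>>
groupByType(const std::vector<Base*>& items) {
    std::map<std::type_index, std::vector<const Base*>> groups;
    for (const Base* item : items) {
        assert(item != nullptr);  // typeid on a null pointer would throw
        groups[std::type_index(typeid(*item))].push_back(item);
    }
    return groups;
}
```

Once the objects are grouped this way, each group has a single known layout, which is the property a split branch needs.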

Cheers,
Philippe.

gfronze · June 3, 2017, 4:07pm

Dear Manuel,

Thank you for your brainstorming!
I agree with your whole explanation, but I knew ROOT was able to handle subclasses in a TBranch, so I wanted to know how to use that feature without changing my classes!
For the moment I will stick with Philippe’s suggestions, but if they don’t work I will get back to your ideas for sure.
Cheers!

Gabriele

gfronze · June 3, 2017, 4:08pm

Dear Philippe,

Thank you so much for your precise answer, I will try your first solution first.
Have a nice day!
Cheers!

Gabriele

mschille · June 4, 2017, 9:00pm

Dear Gabriele,

okay, I clearly underestimated the clever things that ROOT’s I/O subsystem can do - my apologies to you, and also to Philippe!

Good luck, and all the best,

Manuel

gfronze · June 7, 2017, 2:25pm

Dear Philippe, All,

I have managed to handle my subclasses within the same branch! Thank you so much for your help!

Now I have run into an additional problem. In fact, I would like to “modify” the data members of those items, and from what I understand TTree does not allow that, even if the needed space is already allocated.

Do you have any advice on the best practice for this kind of data modification?
Am I forced to clone the TTree, modify the clone and rewrite it to the file (overwriting the original)?

Many thanks once again!

Gabriele

pcanal · June 7, 2017, 2:27pm

What are the semantics of the modification? How often would those modifications need to be made?

Cheers,
Philippe.

gfronze · June 7, 2017, 2:35pm

As I said before, I have heterogeneous classes:

- Some items have run number and beam type info (and much more!) (classes A, B, C)
- Some items don’t (class D)

Once the entries have been sorted by timestamp, I would like to assign the run number and beam type information to the entries without them (D) that have been recorded (chronologically) between items containing that kind of data (A, B, C).

In a similar way, when I have two “A” items at different timestamps, I want to interpolate one of their data members and assign the interpolated value to the elements of type “D” that lie (chronologically) between them.

The modifications have to be made once per entry: after that the algorithm will avoid additional modifications since every data member is set and a flag is then raised.
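
As a rough illustration of the two fill-in rules described above, the following sketch works on an in-memory, time-sorted vector in plain C++. The Entry layout and the function name fillMissing are hypothetical, beam type is omitted for brevity, linear interpolation is an assumption (the post does not say which interpolation is used), and the hasRunInfo/hasValue booleans stand in for the “flag raised” once a member is set.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flattened view of one tree entry.
struct Entry {
    double timestamp;
    bool hasRunInfo;  // true for A/B/C-like entries, false for D until filled
    int runNumber;
    bool hasValue;    // true for "A"-like entries, false for D until filled
    double value;     // the data member to interpolate
};

// Single pass over time-sorted entries:
//  - entries without run info inherit it from the latest entry that has it;
//  - entries between two value-carrying entries get a linearly
//    interpolated value.
void fillMissing(std::vector<Entry>& entries) {
    bool haveRun = false;
    int lastRun = 0;
    bool haveAnchor = false;
    std::size_t anchor = 0;  // index of the latest entry carrying a value

    for (std::size_t i = 0; i < entries.size(); ++i) {
        Entry& e = entries[i];
        // The input must already be sorted by timestamp.
        assert(i == 0 || entries[i - 1].timestamp <= e.timestamp);

        if (e.hasRunInfo) {
            haveRun = true;
            lastRun = e.runNumber;
        } else if (haveRun) {
            e.runNumber = lastRun;
            e.hasRunInfo = true;
        }

        if (e.hasValue) {
            if (haveAnchor) {
                const Entry& a = entries[anchor];
                const double dt = e.timestamp - a.timestamp;
                for (std::size_t k = anchor + 1; k < i; ++k) {
                    Entry& between = entries[k];
                    const double f =
                        dt > 0.0 ? (between.timestamp - a.timestamp) / dt : 0.0;
                    between.value = a.value + f * (e.value - a.value);
                    between.hasValue = true;
                }
            }
            haveAnchor = true;
            anchor = i;
        }
    }
}
```

Entries after the last value-carrying entry are left without a value, since there is nothing to interpolate towards; they still receive the most recent run number.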

Cheers,
Gabriele

pcanal · June 8, 2017, 2:14pm

Can that be done ‘during’ the writing of the sorted TTree?

Cheers,
Philippe.

gfronze · June 12, 2017, 11:52am

Dear Philippe,

Thanks again for your help!
Last week I had the opportunity to brainstorm a little bit on this with dpiparo.
I have now successfully handled the problem by doing an entry-by-entry copy of the TTree while modifying the entries on the fly, thanks to the ordering routines.
With a slight modification of the whole algorithm I have managed to pack all the accesses into a single pass, to reduce the number of rewrites.
In the end I am able to create a copy of each TTree with the same name and title and rewrite it to the same file with the TObject::kOverwrite flag: for the user nothing changes but the data members!

By the way, I’ll look further into a “disk buffered container”; I think that could be a nice addition!

Cheers,
Gabriele
