Streamer for inherited variables

bozzochet · September 29, 2013, 10:17am

Dear ROOTERS,

I have a question that I was not able to solve by myself, even after “studying” the manual (AddingaClass, Trees and InputOutput) at length and googling…

Suppose I have a class ‘Event’ written to a file in a TTree.
This class Event has a collection of ‘Track’ objects, each one containing information about a particular track in the event.
Then suppose I want to have another collection of TrackPlus objects containing much more information than the previous one.
Now: the TrackPlus class is a “daughter” of the Track class, but a TrackPlus object is added to the collection only for certain events/tracks (if the track is very interesting). We decided to make TrackPlus a daughter of Track so that, for the interesting tracks, the user can use ONLY the TrackPlus (with many more variables) without continuously accessing the Track one for the ‘missing’ variables.
Of course this is a waste of disk space: if for an Event we write the TrackPlus for track number M then, essentially, Track number M is written twice: in the first collection as Track and in the second (shorter) collection as TrackPlus…

So I was wondering if it is possible, for the Track class, to skip the writing of the members if the object written is not a Track itself but one of its daughters, as is done for TObject using the IgnoreTObjectStreamer() method… Then of course I would also like to be able to fill back, for TrackPlus, the Track variables once reading it from disk, via, for example, a rule in the dictionary (or something similar: for example, writing the rule to access the corresponding Track object in the Event collection of Track objects in the default constructor…)…

The whole picture could seem very confused, but I can guarantee that in the schema we are using it definitely makes sense, since this splitting is used in a very smart (I think…) way…

Thanks a lot,
Matteo

pcanal · September 30, 2013, 4:27pm

Hi Matteo,

If your main collection is a vector<Track*>, you can store both Track and TrackPlus in it. If in addition you want the collection to be split in the TTree, you can set the split level to a number higher than 100 and internally the Track and the TrackPlus will be stored in two sets of (hidden) sub-branches with no data duplication and fancy back filling.

Cheers,
Philippe.

bozzochet · September 30, 2013, 5:39pm

Dear Philippe,

so far we are using vector but we’re thinking to change it exactly because otherwise no polymorphism is deployable… But this is not the point…
The point is that I would like the ‘Track part inside TrackPlus’ (TrackPlus being a daughter of Track) not to be written, since the very same Track (now simply as a Track itself) is also written. I create the TrackPlus with a constructor that copies the Track:

TrackPlus::TrackPlus(const Track& father, bool ignstream) : Track(father) {
  if (ignstream) TrackPlus::Class()->IgnoreTObjectStreamer();
}

where the ‘father’ is also written to disk.
If you instead mean that TrackPlus could have a pointer to its ‘father’ (instead of being its daughter…), so that the Track is automatically written just once and then ‘reconstructed’ back during reading, I know, but:
[ul]
[li]I would not have the benefits of inheritance (essentially I would continuously have to retrieve the variables through the father pointer)[/li]
[li]I’m crazy (:)) and I put “//!” on the vectors and then I ‘stream’ the vectors to different files… As far as I know the only way to make it “work automagically” (in reconstructing the references) is to use TRef; I tried it and observed that it works, but a lot of space is “wasted”[/li]
[/ul]

Thanks a lot!
Matteo

pcanal · September 30, 2013, 6:08pm

[quote]so far we are using vector but we’re thinking to change it exactly because otherwise no polymorphism is deployable… But this is not the point…[/quote]
Actually, in my opinion, it is central to the point. The problem will be much less academic if/when you have decided how the two parts are shared in memory.

For example, you could actually have the reverse idea, where the Track has an extra pointer, often set to zero, pointing to extra parameters.
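A minimal sketch of this ‘reverse’ layout in plain C++, with no ROOT I/O involved. The names TrackExtra, chi2, nHits, extraOf and makeEvent are invented for illustration and do not come from the thread:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical payload kept only for the "very interesting" tracks.
struct TrackExtra {
    double chi2 = 0.0;
    int nHits = 0;
};

// Every track has the basic data; the extra pointer is usually null.
struct Track {
    double momentum = 0.0;
    std::unique_ptr<TrackExtra> extra;  // null for ordinary tracks

    bool isInteresting() const { return extra != nullptr; }
};

// Access the extras of a track that is known to carry them.
inline const TrackExtra& extraOf(const Track& t) {
    assert(t.extra && "track has no extra parameters");
    return *t.extra;
}

// Build a small event: three tracks, only the second one carries extras.
inline std::vector<Track> makeEvent() {
    std::vector<Track> tracks(3);
    tracks[0].momentum = 1.5;
    tracks[1].momentum = 42.0;
    tracks[1].extra = std::make_unique<TrackExtra>();
    tracks[1].extra->chi2 = 0.9;
    tracks[1].extra->nHits = 17;
    tracks[2].momentum = 3.0;
    return tracks;
}
```

With this layout the basic variables exist once per track, and the user reaches both the basic and the extra variables from the same object.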

If you decide for vector<Track*>, then the differences in behavior and the rebuilding are handled automatically by the system.

Or you could have simple references that are indices into the other collections, etc…
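The index-based variant could look roughly like the following plain C++ sketch. TrackPlusInfo, trackIndex, trackOf and makeIndexedEvent are hypothetical names chosen for the example, and nothing here depends on ROOT:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Track {
    double momentum = 0.0;
};

// Extra data for an interesting track; it refers to the Track by position
// instead of containing or inheriting a copy of it.
struct TrackPlusInfo {
    std::size_t trackIndex = 0;  // index into Event::tracks
    double chi2 = 0.0;
};

struct Event {
    std::vector<Track> tracks;
    std::vector<TrackPlusInfo> extras;  // shorter than tracks

    // Resolve the reference back to the basic track data.
    const Track& trackOf(const TrackPlusInfo& info) const {
        assert(info.trackIndex < tracks.size());
        return tracks[info.trackIndex];
    }
};

// Three tracks, with extras stored only for the track at index 1.
inline Event makeIndexedEvent() {
    Event ev;
    ev.tracks.resize(3);
    ev.tracks[0].momentum = 1.5;
    ev.tracks[1].momentum = 42.0;
    ev.tracks[2].momentum = 3.0;

    TrackPlusInfo info;
    info.trackIndex = 1;
    info.chi2 = 0.9;
    ev.extras.push_back(info);
    return ev;
}
```

An index stays valid when the two collections are stored separately, as long as the order of the tracks is the same when they are read back.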

The constructor

TrackPlus::TrackPlus(const Track& father, bool ignstream) : Track(father) {
  if (ignstream) TrackPlus::Class()->IgnoreTObjectStreamer();
}

seems to be a real problem, because the parameter seems to indicate that you want the “skip my parent” behavior to be local to this object, but the implementation makes it global to all objects.
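The local-versus-global mismatch can be reproduced without ROOT. In the sketch below, FakeClassInfo and TrackPlusMock are invented stand-ins: the static classInfo() plays the role of a per-class object such as the one returned by Class(), and it is only an analogy, not ROOT's actual implementation:

```cpp
#include <cassert>

// Hypothetical stand-in for state that belongs to a class, not to an object.
struct FakeClassInfo {
    bool skipParent = false;
};

struct TrackPlusMock {
    // One FakeClassInfo shared by every TrackPlusMock instance.
    static FakeClassInfo& classInfo() {
        static FakeClassInfo info;
        return info;
    }

    explicit TrackPlusMock(bool ignstream) {
        // The argument looks per-object, but it flips shared state.
        if (ignstream) classInfo().skipParent = true;
    }

    bool wouldSkipParent() const { return classInfo().skipParent; }
};

// Returns true if constructing a second object with ignstream=true
// changes the behavior of an object that was built with ignstream=false.
inline bool earlierObjectAffected() {
    TrackPlusMock::classInfo().skipParent = false;  // start from a known state
    TrackPlusMock first(false);
    assert(!first.wouldSkipParent());
    TrackPlusMock second(true);
    return first.wouldSkipParent();
}
```

Once any object is constructed with the flag set, every other object of the class sees the same setting, which is the behavior described above.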

Cheers,
Philippe.

bozzochet · September 30, 2013, 8:37pm

Dear Philippe,

thanks a lot for the reply, but I must admit that I failed to understand everything in it. Most probably I’m not as skilled as I think!

Let’s start by saying that if this allows me to do what I’m asking in this topic, it is a good reason to move to a vector of pointers. In addition there are several places, so far, where the impossibility of using polymorphism was annoying me…
So we can assume that I have vectors of pointers.

I didn’t understand, however, how I can “save” the space if I do something like:

std::vector<Track*> vt;
std::vector<TrackPlus*> vtp;

Track* t = new Track();
vt.push_back(t);

TrackPlus* tp = new TrackPlus(*t);
vtp.push_back(tp);

once I stream the two vectors in the TTree, the Track variables are written twice: once in the object pointed to by t and once in the “father part” of the object pointed to by tp, aren’t they?

Then I didn’t understand your comment about the constructor: are you referring to IgnoreTObjectStreamer()? I put this in TrackPlus in case at some point I need something like TrackPlusPlus, but the very same syntax is used in the Track class: I saw in the manual (and I also tested) that, to really avoid streaming the fUniqueID and fBits variables from TObject, you need to call IgnoreTObjectStreamer() on the class of the object that is really streamed and not on the father class (so in this case it is called for TrackPlus but not for the father, Track).

Essentially I would like to be able to write a method like IgnoreTObjectStreamer() for my Track class, to avoid streaming some variables (as I do with //! but callable at runtime).
Then I would need another mechanism to re-fill the Track father part when reading the data back from file…

Thanks a lot for your replies and your comment,
Matteo

pcanal · September 30, 2013, 8:54pm

[code]std::vector<Track*> vt;
std::vector<TrackPlus*> vtp;

Track* t = new Track();
vt.push_back(t);

TrackPlus* tp = new TrackPlus(*t);
vtp.push_back(tp);[/code]
My main point is that here, if TrackPlus simply inherits from Track, you are ‘wasting’ space in memory at run-time and duplicating the information. This is already a problem in my opinion.

[quote]to really avoid to stream the fUniqueID and fBits variables from TObject, you need to call IgnoreTObjectStreamer() [/quote]
Yes, but unfortunately it is a global state that affects all objects of that type.

And to answer your direct question, there is no easy way to do it short of writing a custom streamer, which would also disable the ability to split objects of this class. But yes, we could find a way to ‘make it work’. However, I do not think it is the right direction to go in, and it would be costly in development time and maintenance. You might find that if you find the ‘correct’ solution to avoid the in-memory data duplication represented in the code example above, you will also solve the streaming issue.

[quote]So we can assume that I have vectors of pointers.[/quote]
Then you are done. Store the Track and TrackPlus in the same collection and either do not split the collection when storing it in a TTree, or split it with a split level greater than 100 (which will allow the splitting of the vector of Track*).
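The in-memory side of this suggestion, a single collection holding both types, might be sketched as follows in plain C++. The thread talks about vector<Track*>; std::unique_ptr is used here only to keep ownership simple, and the member names and helper functions are assumptions made for the example. The TTree branch itself is not shown:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Track {
    double momentum = 0.0;
    virtual ~Track() = default;  // polymorphic base
};

struct TrackPlus : Track {
    double chi2 = 0.0;  // hypothetical extra variable
};

using TrackCollection = std::vector<std::unique_ptr<Track>>;

// One collection: an ordinary Track and an interesting TrackPlus.
// The interesting track exists only once, as a TrackPlus.
inline TrackCollection makeCollection() {
    TrackCollection tracks;

    auto plain = std::make_unique<Track>();
    plain->momentum = 1.5;
    tracks.push_back(std::move(plain));

    auto rich = std::make_unique<TrackPlus>();
    rich->momentum = 42.0;
    rich->chi2 = 0.9;
    tracks.push_back(std::move(rich));

    return tracks;
}

// Count the elements whose dynamic type is TrackPlus (or derived from it).
inline std::size_t countTrackPlus(const TrackCollection& tracks) {
    std::size_t n = 0;
    for (const auto& t : tracks) {
        assert(t != nullptr);
        if (dynamic_cast<const TrackPlus*>(t.get()) != nullptr) ++n;
    }
    return n;
}
```

Because an interesting track is stored once, as a TrackPlus, its Track part is not duplicated in memory, and the user still gets all variables from a single object.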

Cheers,
Philippe.

bozzochet · October 1, 2013, 7:00am

Dear Philippe,

thanks again!

Yes, probably I didn’t explain very well: I really want the variables coming from TObject to be skipped, for every object in my TTree…

You mean that if I make the TrackPlus to BE a Track the problem is unavoidable. If instead I make TrackPlus to HAVE a Track (via a pointer) I would solve the problem, correct?

For the splitting, I do have the split level greater than 100 but, as I said, I also stream the vectors to different files. This is not to “manually” drive the splitting (which would be very ineffective) but because it is the main design of the whole thing: split the data files for modularity, both in terms of total size (if you don’t need everything for the analysis you’re carrying out…) and in terms of “cost” if we need to reproduce just the part related to one detector while keeping the rest exactly the same…

Thanks again,
Matteo

bozzochet · October 1, 2013, 7:23am

Dear Philippe,

I have another question. You wrote several times “splitlevel greater than 100”. Can you confirm this? I mean: from the reference guide I “learned” that each “splitlevel” level allows you to split an additional level of sub-classing, down to the final single variables. After writing to you I checked my code and realized I had left the default value of 99 (both in the TTree constructor and in TTree::Branch()). Do I need to increase this value to one greater than 100?

Thanks a lot,
Matteo

pcanal · October 1, 2013, 12:12pm

Hi,

Yes, this is a special mode for collections of pointers. See http://root.cern.ch/root/html/TTree.html:
[quote]TBranch *branch = tree->Branch( branchname, STLcollection, buffsize, splitlevel);
STLcollection is the address of a pointer to std::vector, std::list,
std::deque, std::set or std::multiset containing pointers to objects.
If the splitlevel is a value bigger than 100 (TTree::kSplitCollectionOfPointers)
then the collection will be written in split mode, e.g. if it contains objects of
any types deriving from TTrack this function will sort the objects
based on their type and store them in separate branches in split
mode.[/quote]

Cheers,
Philippe.

bozzochet · October 1, 2013, 12:30pm

Thanks a lot,

this is written in the TTree comments but I didn’t find it in the ROOT Guide!

Thanks a lot for the hint!

Matteo