Dear ROOTers,
I am trying to use the TTreeReader facility (with ROOT 6.05/03/somecommit; colleagues are using 6.04 releases) on a data structure that resembles the “Event” class example (in split mode). For the purpose of this discussion the data can be simplified as follows (I can provide a reduced data file if that helps):
// N.B. ClassDef macros are present in all classes; omitted here for clarity
class Event : public TObject {
public:
    // [...] Some stuff, but in particular:
    std::vector< Peak > peaks;
    std::vector< Interaction > interactions;
};
class Peak : public TObject {
public:
    // [...] This is a large object with a lot of basic data types, e.g.:
    Float_t area;
    // ... but also another, nested vector (N.B. this is NOT further split in the current version of the TTree)
    std::vector< ReconstructedPosition > reconstructedposition;
};
class Interaction : public TObject {
public:
    // Only basic data types, in particular:
    Int_t s2; // index of the related element within the "peaks" vector
};
// The class ReconstructedPosition has only basic data types
All these class definitions are contained in “classes.{hpp|cpp}” files that are provided together with the ROOT files. I compile them once and for all into a library “classes.so” in a separate ROOT session with 'gSystem->CompileMacro("classes.cpp","kf","classes")'. Then I use 'gSystem->Load("classes.so")' to load the library just after starting the ROOT session.
Then I access the data with a compiled script (via “.L myscript.cc++”) that looks like:
TTreeReader myreader(tree);
TTreeReaderArray<Interaction> interactions(myreader, "interactions");
// EACH TIME ONLY ONE OF THE 3 OPTIONS BELOW:
TTreeReaderArray<Peak> peaks(myreader, "peaks"); //--> OPTION 1
TTreeReaderArray<Float_t> areas(myreader, "peaks.area"); //--> OPTION 2
TTreeReaderValue< std::vector<Peak> > peaksvec(myreader, "peaks"); //--> OPTION 3
// THEN I ACCESS THE OBJECTS E.G. WITH
while ( myreader.Next() ) {
    for (const auto& interaction : interactions) {
        cout << "S2 id " << interaction.s2 << "\t\t area "
        //   << peaks[interaction.s2].area << endl; //--> FOR OPTION 1
        //   << areas[interaction.s2] << endl; //--> FOR OPTION 2
             << (*peaksvec)[interaction.s2].area << endl; //--> FOR OPTION 3
    }
}
I observe that the TTreeReaderArray has no problem iterating over the “Interaction” vector, while with “Peak” it starts returning garbage after a few events, so I have to stick to OPTION 3.
Q1) Am I doing something wrong, or is there a bug in TTreeReaderArray (probably a known one… I found this report: sft.its.cern.ch/jira/browse/ROOT-7581)?
Q2) Is a “linkdef” file, with the directives to generate dictionaries for all the std::vector types, required when compiling the “classes.so” library? Would this actually solve my problem? (I don’t get any hint message during compilation, though.)
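For reference, the kind of linkdef directives meant in Q2 would look roughly like the sketch below. The file name is hypothetical, and whether these directives are needed at all in this setup is exactly the open question.

```cpp
// classesLinkDef.h (hypothetical file name, shown only as a sketch)
#ifdef __CLING__
// Dictionaries for the user classes
#pragma link C++ class ReconstructedPosition+;
#pragma link C++ class Peak+;
#pragma link C++ class Interaction+;
#pragma link C++ class Event+;
// Dictionaries for the std::vector instantiations stored in the tree
#pragma link C++ class std::vector<ReconstructedPosition>+;
#pragma link C++ class std::vector<Peak>+;
#pragma link C++ class std::vector<Interaction>+;
#endif
```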
Q3) Is OPTION 2, i.e. accessing a sub-branch of a vector of objects, actually supported? The documentation is not really explicit… I tried it, and it worked… for a few events, just like option 1 (at the beginning I thought that the problem was with option 2). If option 2 is not supported, it seems to me that option 1 entails some extra I/O cost due to reading the whole object… am I wrong?
Q4) The workaround of OPTION 3 works, but seems even more expensive in terms of I/O because you have to load the full vector of objects… can you comment on that?
Some other generic but related questions:
Q5) Does the TTreeReader approach in general have some (non-negligible) overhead compared with the old way of manually calling “SetBranchAddress()”, or even GetEntry() on single branches instead of the full tree?
I would like to benchmark the different approaches, and benchmarks are always a bit tricky in the details. Q6) Can I follow the example of $ROOTSYS/test/MainEvent.cxx with the TStopwatch class? Are there any other details that matter?
Q7) In the documentation, TClonesArray is described as superior in terms of performance to other arrays of objects. Still, it seems to me (I may be wrong) that std::vector is preferred by most people, and I wonder whether, given the large number of users, you have optimised the I/O of std::vector so that TClonesArray is no longer a must.
Thanks in advance for going through this big bunch of questions,
Matteo