Checking if TTrees with unsplit cpp classes as branches are identical


#1

Is there a generic way to compare two TTrees containing unsplit cpp classes to see if the two are identical? I want to avoid hard-coding the branches and their data members, so that a script can loop over them and check. Also, some of the branches might be vectors of the cpp classes.

Here are some approaches I am considering but am not aware of how to implement:

Consideration 1) Using GetListOfBranches to loop over the main TTree branches and then using some other function to get the class type of that branch and another function to get a list of all the data members of that class type. If the data member is itself a class with data members, then this process should repeat until one is looping over simple variables.

Consideration 2) Make a script that takes the unsplit TTree and copies it with split-level=99. Then one can simply loop over the fully split TTree with GetListOfBranches and compare branches with matching names.
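
The recursion described in Consideration 1 can be sketched in plain Python. This is only an illustration of the idea, not ROOT code: the dataclasses `Hit` and `Track` are hypothetical stand-ins for the unsplit cpp classes, and `find_differences` is a made-up helper name. With ROOT, the list of members would presumably come from the class dictionary (for example `TClass::GetListOfDataMembers`) rather than from `dataclasses.fields`.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, List


def find_differences(a: Any, b: Any, path: str = "") -> List[str]:
    """Return the paths of all simple members that differ between a and b."""
    label = path or "<root>"
    if type(a) is not type(b):
        return [label]
    # Class with data members: recurse into each member.
    if is_dataclass(a) and not isinstance(a, type):
        diffs = []
        for f in fields(a):
            child = f"{path}.{f.name}" if path else f.name
            diffs.extend(find_differences(getattr(a, f.name), getattr(b, f.name), child))
        return diffs
    # Container (stand-in for std::vector): compare element by element.
    if isinstance(a, (list, tuple)):
        if len(a) != len(b):
            return [f"{label} (length {len(a)} vs {len(b)})"]
        diffs = []
        for i, (x, y) in enumerate(zip(a, b)):
            diffs.extend(find_differences(x, y, f"{path}[{i}]"))
        return diffs
    # Simple variable: compare directly.
    return [] if a == b else [label]


# Hypothetical classes, used only to exercise the sketch.
@dataclass
class Hit:
    x: float
    y: float


@dataclass
class Track:
    pt: float
    hits: List[Hit]


t1 = Track(10.0, [Hit(1.0, 2.0), Hit(3.0, 4.0)])
t2 = Track(10.0, [Hit(1.0, 2.0), Hit(3.0, 4.5)])
print(find_differences(t1, t2, "track"))  # prints ['track.hits[1].y']
```

The function returns the full path of each differing member, so the names of the variables that differ come out directly without hard-coding any class layout.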

Any suggestions are appreciated.


ROOT Version: v6-14-00
Platform: macOS 10.12.6
Compiler: Apple LLVM version 9.0.0 (clang-900.0.39.2)



#2

See https://gitlab.cern.ch/swenzel/ObjectCmp


#3

or:


#4

Thanks @sbinet and @pcanal. These look like they should work. I will give them a try and get back to you if I have any questions.


#5

Hi again,
@pcanal: So the ObjectCmp tool is very useful for getting a yes or no answer about two files being the same. I am not sure how to interpret the output though. I see things like
FOUND NUMERIC DIFFERENCE FOR KEY TTree.fBranches{TObjArray}[10].fBasketBytes{int*}[5] ABSOLUTE -60
Is there a straightforward way to determine which variable this output says differs? Ideally I would like a script that prints out the names of the variables that differ.

@sbinet: The diff_root file also seems like it would work, but how do you get set up to use it? It depends on PyUtils. I think it is the same thing as the diff_root command for acmd.py that is available after setting up an Athena Analysis release. However, my code is compiled against AnalysisBase, and the diff_root command won't work unless class dictionaries are loaded.


#6

diff_root and PyUtils are (very) old code of mine, when ATLAS was still using CMT.
I must admit I don’t know how the ATLAS runtime is set up now with all the “new” CMake stuff.
You may want to bug ATLAS people for this.

The root-diff command is self-contained and shouldn’t even need dictionaries (just the streamers that are embedded in the ROOT file.)


#7

It looks like you are using ObjectCmp to compare the TTree itself rather than its content. The output you show says that, for branch number ‘10’, the size of basket 5 is different.
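
To make that reading concrete, here is a minimal Python sketch that pulls the branch index, member name and array index out of a key such as the one quoted above. The regular expression is an assumption based only on that single quoted line, and `parse_key` is a hypothetical helper, not part of ObjectCmp.

```python
import re

# Pattern inferred from the one example line in this thread (an assumption,
# not a documented ObjectCmp format).
KEY_PATTERN = re.compile(
    r"fBranches\{TObjArray\}\[(?P<branch>\d+)\]"
    r"\.(?P<member>\w+)\{[^}]*\}\[(?P<index>\d+)\]"
)


def parse_key(line):
    """Return (branch_index, member_name, element_index), or None if no match."""
    match = KEY_PATTERN.search(line)
    if match is None:
        return None
    return int(match.group("branch")), match.group("member"), int(match.group("index"))


line = (
    "FOUND NUMERIC DIFFERENCE FOR KEY "
    "TTree.fBranches{TObjArray}[10].fBasketBytes{int*}[5] ABSOLUTE -60"
)
print(parse_key(line))  # prints (10, 'fBasketBytes', 5)
```

With the tree open in PyROOT, `tree.GetListOfBranches().At(10).GetName()` should then give the name of the branch at that index, so nothing has to be counted by hand in the header files.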


#8

@sbinet
If I just set up an AthAnalysis release and run

lsetup 'asetup AthAnalysis,21.2.55'
acmd.py diff-root file1 file2

I get errors like the following:

  1. PyRoot: TClass::Init:0: RuntimeWarning: no dictionary for class Foo::Bar is available
  2. Error in <TBufferFile::CheckByteCount>: object of class vector<Foo::Bar> read too few bytes: 6 instead of 8
  3. input_line_72:1:28: error: use of undeclared identifier 'Foo' template class std::vector<Foo::Bar>;
  4. requested class 'Foo' does not exist

It ultimately crashes after that last error.


#9

Yes, I am comparing the TTree. If I do not provide the TTree name as the “object name”, then it dumps thousands of lines saying
Null object encountered; no data to analyse

Can you determine the variable name from the branch and basket number without looking through header files and counting things by hand?