Compare two TTrees for equality


How can I compare two trees without knowing their internal structure or datatypes?

I am trying to write a PyROOT utility to compare ROOT files written by our data analysis program. The goal of this utility is to make sure our code generates the same output for a given input as we refactor and add new features. Since we produce many different ROOT files, I would like to create one tool that compares them all in some meaningful way.

Comparing the ROOT files by their binary data doesn’t work, so I need to open the file and traverse the structure to do comparisons. I have been able to come up with a scheme to compare histograms and objects which implement operator==, but I have not been able to figure out how to compare two trees.

Is there some way to get the tree to compare the data it reads in? Maybe by working with TBranch? I see that I can access the data in some form with TBranch::GetAddress. This seems to be what is happening in PyROOT’s Pythonize.cxx with TTree (See PyROOT::TTreeGetAttr).

It seems as though I might be able to loop through the branches for each entry and compare the memory directly.

Any ideas?

Perhaps you can get some inspiration from the new commands such as rootls (see the implementation in /main/python), which lists the objects in a ROOT file. It would be great if a generic rootdiff could be developed. These commands are written in Python.
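As a starting point for such a rootdiff, here is a minimal sketch of the first step: pairing up the top-level keys of two files by name and class. The function names key_summary and diff_keys are made up for illustration; the sketch only assumes the objects passed in offer GetListOfKeys, and that each key offers GetName, GetCycle and GetClassName, as TDirectory and TKey do in PyROOT. It does not descend into subdirectories and does not compare the objects themselves.

```python
def key_summary(directory):
    """Map each key name to its class name, keeping only the highest cycle."""
    latest = {}
    for key in directory.GetListOfKeys():
        name, cycle = key.GetName(), key.GetCycle()
        if name not in latest or cycle > latest[name][0]:
            latest[name] = (cycle, key.GetClassName())
    return {name: cls for name, (_, cls) in latest.items()}


def diff_keys(dir_a, dir_b):
    """Return (only_in_a, only_in_b, class_mismatch) as sorted name lists."""
    a, b = key_summary(dir_a), key_summary(dir_b)
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    # Same name in both files, but stored as different classes.
    mismatch = sorted(n for n in set(a) & set(b) if a[n] != b[n])
    return only_a, only_b, mismatch
```

With PyROOT this would be called on two opened files, for example diff_keys(ROOT.TFile.Open("old.root"), ROOT.TFile.Open("new.root")), where the file names are placeholders. Names that appear in both files with the same class are the ones a per-class comparison (histograms, trees, ...) would then have to handle.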

I’m not sure if it completely makes sense, but it seems that if the leaves in a TTree compare equal for all entries, then the trees are equal. It is probably also the case that leaf data are basic types.
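To make that idea concrete, here is a rough sketch of an entry-by-entry leaf comparison. The function name first_difference and its return convention are my own invention. It assumes every leaf holds numeric data that TLeaf::GetValue can return as a double, so it says nothing useful about leaves holding objects or strings, and 64-bit integers above 2^53 may lose precision in the conversion. It also assumes both trees list their leaves in the same order.

```python
import math


def first_difference(tree_a, tree_b, rel_tol=0.0):
    """Return None if all leaf values match, else (reason, entry, leaf_name)."""
    leaves_a = list(tree_a.GetListOfLeaves())
    leaves_b = list(tree_b.GetListOfLeaves())
    layout_a = [(l.GetName(), l.GetTypeName()) for l in leaves_a]
    layout_b = [(l.GetName(), l.GetTypeName()) for l in leaves_b]
    if layout_a != layout_b:
        return ("structure", None, None)
    if tree_a.GetEntries() != tree_b.GetEntries():
        return ("entries", None, None)
    for entry in range(tree_a.GetEntries()):
        # Load the same entry in both trees so the leaves point at its data.
        tree_a.GetEntry(entry)
        tree_b.GetEntry(entry)
        for la, lb in zip(leaves_a, leaves_b):
            if la.GetLen() != lb.GetLen():
                return ("length", entry, la.GetName())
            for i in range(la.GetLen()):
                va, vb = la.GetValue(i), lb.GetValue(i)
                if math.isnan(va) and math.isnan(vb):
                    continue  # treat NaN in the same slot as equal
                # rel_tol=0.0 means exact equality; raise it to allow rounding noise.
                if not math.isclose(va, vb, rel_tol=rel_tol, abs_tol=0.0):
                    return ("value", entry, la.GetName())
    return None
```

Returning the first mismatch rather than a plain boolean makes a regression easier to track down. Whether exact equality or a tolerance is appropriate depends on how reproducible the floating-point output of the analysis code is expected to be.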

Is there an explanation anywhere of exactly how TTree and TBranch organize their data? If I have a TBranch, how can I acquire the leaf data? It seems that PyROOT is able to do this correctly; you can see it happening in Pythonize.cxx.

I can do it for my tree too:

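// At the ROOT prompt, assuming an entry has already been loaded (e.g. tree->GetEntry(0)).
// Take the third sub-branch of the "eventData" branch.
auto be = (TBranchElement*)tree->GetBranch("eventData")->GetListOfBranches()->At(2);
be->GetName() // yields: (const char *) "neutronPixY"
// Byte offset of this data member inside the object the branch points to.
Long_t offset = ((TStreamerElement*)be->GetInfo()->GetElements()->At(be->GetID()))->GetOffset();
auto x = (Int_t*)(be->GetObject() + offset);
*x // yields: (int) 137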
// ************************
// *    Row   * eventData *
// ************************
// *        0 *       137 *
// ************************

The question is how to traverse only the leaves.
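One possible way to visit only the leaves, without walking the branch hierarchy by hand, is TTree::GetListOfLeaves, which as far as I can tell returns a flat list covering nested branches too. The sketch below (describe_leaves is a hypothetical helper name) collects each leaf's owning branch, its name and its type name, which also gives a quick way to see whether any leaf in a given file is something other than a basic type.

```python
def describe_leaves(tree):
    """Return a list of (branch_name, leaf_name, type_name), one per leaf."""
    rows = []
    for leaf in tree.GetListOfLeaves():
        rows.append((leaf.GetBranch().GetName(),
                     leaf.GetName(),
                     leaf.GetTypeName()))
    return rows
```

In PyROOT one would pass in a tree fetched from a file and print the rows; comparing the row lists of two trees is also a cheap structural check to run before comparing any values.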

And maybe a more important question is, does comparing leaf values conceptually make sense as an operator== for TTree? Are there situations where the leaf values could be the same but the trees are not truly equal?

Also, will leaves always be primitive types?