TBranch cannot return TVector3* or TLorentzVector*

jonperkin81 · January 28, 2012, 9:09pm

Hi,

I am having a problem similar to the one described here:

It looks like neither a TVector3* nor a TLorentzVector* can be read from a TBranch, in the same way that a bool* couldn’t; this time, however, a bus error is returned.

An object containing an array of vectors:

class Data : public TObject
{
public:
  int            nBool;
  bool           bArray[_NBOOL_];
  int            iArray[_NBOOL_];
  TLorentzVector lArray[_NBOOL_];
  TVector3       vArray[_NBOOL_];

  ClassDef(Data,1);
};

is written to a TTree. The following works in a .C macro:

  gSystem->Load("boolArray_C.so");

  TFile *f = TFile::Open("boolArray.root");
  TTree *t = (TTree*)f->Get("boolTree");

  Data *d = new Data();
  t->SetBranchAddress("data",&d);
  t->GetEvent(0);
  
  for(int i=0; i<d->nBool; i++)
    {
      cout<<"iArray["<<i<<"] = "<<d->iArray[i]<<"\tbArray["<<i<<"]  = "<<d->bArray[i]<<"\n";
      d->lArray[i].Print();
      d->vArray[i].Print();
    }

but not in a .py script:

import ROOT

ROOT.gSystem.Load("boolArray_C.so")

f = ROOT.TFile.Open("boolArray.root")
t = f.Get("boolTree")
t.GetListOfBranches().Print()

d = ROOT.Data()
t.SetBranchAddress("data",ROOT.AddressOf(d))

t.GetEvent(0)

for i in xrange(d.nBool):
    print "iArray[%d] = %d\tbArray[%d] = %d" % (i,d.iArray[i],i,d.bArray[i])
    d.lArray[i].Print()
    
f.Close()

Macros attached.
Any help, much appreciated!
readBoolArray.py (433 Bytes)
readBoolArray.C (580 Bytes)
boolArray.C (1.22 KB)

wlav · January 31, 2012, 7:43pm

Hi,

I don’t think that this is going to work: the dictionary presents the arrays as TLorentzVector* and TVector3*, which is largely fine in C++, since an array and a pointer are mostly the same thing there. In python, however, they are not. Any TLorentzVector* will be presented as a TLorentzVector object, never as an array, and without extra information stating that it really is an array, I don’t think it can be done …

(And yes, the code shouldn’t crash, but if the array case can’t be detected, then neither can an error message be produced.)
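
To illustrate the point about arrays and pointers, here is a minimal sketch in plain C++ with no ROOT involved. `Vec`, `describe_array`, `describe_pointer` and `demo_decay` are made-up names for this example, and `Vec` is only a stand-in for TLorentzVector. It shows that once an array is known only by its element pointer type, the information that it was an array (and its length) is gone, which is roughly the situation a binding generator would be in if it is only told "TLorentzVector*".

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical stand-in for TLorentzVector; not a ROOT class.
struct Vec { double x, y, z, t; };

// Receives the true array type, so the length N is still known.
template <std::size_t N>
std::string describe_array(Vec (&)[N]) {
  return "array of " + std::to_string(N);
}

// Receives only a pointer: one object or many, it cannot tell.
std::string describe_pointer(Vec*) { return "pointer"; }

// An array type decays to exactly the element pointer type.
static_assert(std::is_same<std::decay<Vec[4]>::type, Vec*>::value,
              "Vec[4] decays to Vec*");

// Usage: the pointer overload gives the same answer for both inputs.
inline void demo_decay() {
  Vec many[4] = {};
  Vec single = {};
  assert(describe_array(many) == "array of 4");
  assert(describe_pointer(many) == describe_pointer(&single));
}
```

In C++ the caller knows which case applies and indexes accordingly; a python binding that only sees the pointer type has nothing to base that decision on.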

Cheers,
Wim

jonperkin81 · February 1, 2012, 2:54pm

But that would suggest that any TTree containing an object that owns an array of non-fundamental types cannot be read. Surely not?

wlav · February 1, 2012, 3:45pm

Hi,

not in python, no. To be sure, this is the first time it came up, though: there are general recommendations against using objects with base classes in builtin arrays (since the stride is wrong when iterating over them with a base class pointer), and it’s therefore more common to use std::vector<>s.

For the specific case of use with a TTree, it may be possible to define a new extension object which is then used with SetBranchAddress, but I don’t think it is generally solvable.

Cheers,
Wim

jonperkin81 · February 1, 2012, 8:32pm

Thanks for the info Wim.

I’ll have to try and encourage my collaborators (T2K) to use STL containers if we want our analysis files to be compatible with python.

Cheers

Jon

jonperkin81 · February 2, 2012, 4:46pm

Hi again Wim,

I’m still a little unclear as to whether we’re looking at a fundamental incompatibility between C++ and python, whether this is a ROOT/PyROOT issue, or whether arrays of base classes are a fundamentally flawed concept.

Since ROOT permits the writing of built-in/fixed-length arrays of base class instances to a TTree (encapsulated in an object or otherwise) then one would naively expect PyROOT to support the same features.

To get round my present issue with analysis files of such objects I have had to write a utility class that takes an object, itself containing instances of base class objects in fixed arrays, and fills STL vectors with those instances so that they can be accessed via PyROOT. One could envisage such a thing being incorporated into PyROOT.
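
A rough sketch of the kind of conversion described above, in plain C++ without ROOT. This is not the actual utility class: `Vec3`, `FixedData`, `kMax`, `to_vector` and `demo_to_vector` are all hypothetical names, with `Vec3` standing in for TVector3 and `kMax` for the fixed array size. It assumes the object carries a fill count alongside the fixed-length array, as nBool does in the Data class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for TVector3; not a ROOT class.
struct Vec3 { double x, y, z; };

// Assumed fixed capacity, playing the role of the array size in Data.
const std::size_t kMax = 8;

// Hypothetical analogue of Data: a fixed array plus a fill count.
struct FixedData {
  int  nUsed;
  Vec3 vArray[kMax];
};

// Copy the first `count` elements of a fixed-length array into a vector,
// clamping `count` to the range [0, N].
template <typename T, std::size_t N>
std::vector<T> to_vector(const T (&arr)[N], int count) {
  std::size_t n = count < 0 ? 0 : static_cast<std::size_t>(count);
  if (n > N) n = N;
  return std::vector<T>(arr, arr + n);
}

// Usage: convert only the filled part of the array.
inline void demo_to_vector() {
  FixedData d = {};
  d.nUsed = 2;
  d.vArray[1].y = 5.0;
  std::vector<Vec3> v = to_vector(d.vArray, d.nUsed);
  assert(v.size() == 2 && v[1].y == 5.0);
}
```

The vector carries its own length, so nothing on the reading side has to guess whether a pointer refers to one object or to several.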

If indeed the practice of holding base class objects in an array is fundamentally flawed, perhaps TTrees should not support this feature, or at least warn against it.

Any further insight much appreciated!

Jon

wlav · February 2, 2012, 6:00pm

Jon,

on native arrays, it’s an entry in one of Scott Meyers’ Effective C++ or More Effective C++ books (they’re at home, and it has been a while). The basic point is that one has base classes in order to refer polymorphically to the derived object through base class pointers. However, if you loop over a native array of derived objects through a base class pointer, the stride will be wrong, as sizeof(base) does not need to equal sizeof(derived). If you always refer to the objects in the native array by the derived class only, this is a non-issue.
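
A small sketch of the stride argument, in plain C++ with made-up classes (`Base`, `Derived`, `stride`, `actual_element_spacing` and `demo_stride` are hypothetical names, not ROOT types). It deliberately only compares sizes and never actually walks the array through a Base*, since doing that is exactly the undefined behaviour being warned about.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical classes for illustration only.
struct Base {
  virtual ~Base() {}
  int id = 0;
};

struct Derived : Base {
  double payload[4] = {0, 0, 0, 0};
};

// Number of bytes that `p + 1` advances for a pointer of type T*.
template <typename T>
std::size_t stride() { return sizeof(T); }

// Byte distance between neighbouring elements of a real Derived array.
std::size_t actual_element_spacing() {
  Derived arr[2];
  return static_cast<std::size_t>(
      reinterpret_cast<const char*>(&arr[1]) -
      reinterpret_cast<const char*>(&arr[0]));
}

// Usage: a Base* would step by stride<Base>(), but the elements are
// actually spaced by stride<Derived>().
inline void demo_stride() {
  assert(actual_element_spacing() == stride<Derived>());
  assert(stride<Base>() != stride<Derived>());
}
```

With these definitions a Base* incremented by one would land in the middle of the first Derived element rather than at the start of the second, which is why the array is only safe when accessed through its own element type.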

Note that there are plenty of ways in C++ to do things that can cause hurt. Requiring TTrees not to support these would be quite an undertaking.

As for the exact problem at hand: the dictionary reports the type of the array as TLorentzVector* rather than TLorentzVector[]. Because of that, no distinction can be made when generating the binding. There are a few other such shortcomings in the dictionary (e.g. with the ‘using’ statement). In C++, not making this distinction is perfectly fine: user code can treat both the same and decides, by indexing or not, which is which. In python, it’s not: the former would be an object, the latter would be an array.

Thinking about it some more … Technically, it would be possible to always generate a __getitem__ method for all classes, one that pre-emptively assumes that the pointer can also be used as an array. Then, if it is an array, leave it up to the user to treat it as one.

However, all these problems should be solved if/when we move to LLVM, and so I’d rather focus any such spare time as I have on that.

Cheers,
Wim