Reading TClonesArray in JITted RDF

Hi,
I am trying to read Delphes trees (with split TClonesArrays) in RDataFrame with JITting, so I think my problem is the JITted version of

This fails:

In [7]: df.Define("leadElPT", "Electron[0].PT")
input_line_70:2:53: error: member reference type 'TObject *' is a pointer; did you mean to use '->'?
auto lambda0 = [](TClonesArray& var0){return var0[0].PT
                                             ~~~~~~~^
                                                    ->
input_line_70:2:54: error: no member named 'PT' in 'TObject'
auto lambda0 = [](TClonesArray& var0){return var0[0].PT
                                             ~~~~~~~ ^
input_line_74:2:53: error: member reference type 'TObject *' is a pointer; did you mean to use '->'?
auto lambda0 = [](TClonesArray& var0){return var0[0].PT
                                             ~~~~~~~^
                                                    ->
input_line_74:2:54: error: no member named 'PT' in 'TObject'
auto lambda0 = [](TClonesArray& var0){return var0[0].PT
                                             ~~~~~~~ ^
---------------------------------------------------------------------------

(Electron[0]->PT gives very similar errors: the complaint about using . on a pointer goes away, and only the "no member named 'PT' in 'TObject'" error remains.)
This works, but it’s a bit cumbersome…

ROOT.gInterpreter.Declare("using delphes_electron = Electron;")
df.Define("leadElPT", "dynamic_cast<delphes_electron*>(Electron[0])->PT")

If there’s an easy way to define RVec<Electron> or RVec<Electron*> columns that’s probably good enough as a workaround, but my first attempts to find something for that failed.
Do you have any suggestions?

Thanks,
Pieter


ROOT Version: 6.24/00
Platform: LCG_100 (x86_64-centos7-gcc10-opt) on lxplus
Compiler: GCC 10


Hi Pieter,
If I understand correctly, RDF reads Electron as a TClonesArray branch without problems.
The problem is really that TClonesArray, being type-erased like all ROOT collections, has a clunky API: it requires that you downcast its elements from TObject to their actual types.
That’s fairly different from the topic "Opening TClonesArray from Delphes Branches with RDataFrame", where RDF was not able to read TClonesArray branches at all (that was an I/O problem, this is an API problem).

Now naturally, you’d like RDF to perform an on-the-fly conversion from TClonesArray to a friendlier type such as RVec<Electron>. I think that would be a nice improvement – although I am not sure at this point if we could make that happen a) in a backward-compatible way and b) without a copy of the array elements.

As an aside you should be able to use a static_cast instead of a dynamic_cast for your workaround.
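
To illustrate the difference between the two casts, here is a minimal sketch that compiles without ROOT. BaseObject, ElectronLike and ErasedArray are hypothetical stand-ins for TObject, the Delphes Electron class and a type-erased collection; none of this is actual ROOT or Delphes code.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a polymorphic base class such as TObject.
struct BaseObject {
  virtual ~BaseObject() = default;
};

// Hypothetical stand-in for the Delphes Electron class; only PT matters here.
struct ElectronLike : BaseObject {
  float PT = 0.f;
};

// Hypothetical stand-in for a type-erased collection: it only hands out
// base-class pointers, so callers have to downcast the elements themselves.
struct ErasedArray {
  std::vector<std::unique_ptr<BaseObject>> items;
  BaseObject* operator[](std::size_t i) const {
    assert(i < items.size());
    return items[i].get();
  }
};

// Checked downcast: dynamic_cast yields nullptr if the element is not an
// ElectronLike, at the price of a runtime type check. Returns -1 in that case.
inline float ptChecked(const ErasedArray& arr, std::size_t i) {
  auto* el = dynamic_cast<ElectronLike*>(arr[i]);
  return el ? el->PT : -1.f;
}

// Unchecked downcast: static_cast does no runtime check, so it is only valid
// when the element type is known in advance.
inline float ptUnchecked(const ErasedArray& arr, std::size_t i) {
  return static_cast<ElectronLike*>(arr[i])->PT;
}

// Builds a one-element array for demonstration purposes.
inline ErasedArray makeExample(float pt) {
  ErasedArray arr;
  auto el = std::make_unique<ElectronLike>();
  el->PT = pt;
  arr.items.push_back(std::move(el));
  return arr;
}
```

Both functions return the same value when the element really is an ElectronLike. The static_cast version just skips the check, which is reasonable when every element of the branch is known to have the same class.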

Can you please open a request for improvement at github.com/root-project/root/issues ? I will see what we can do about this.

Cheers,
Enrico

Hi Enrico,
Thanks for the explanation! I see the difference now. I looked a bit more into TClonesArray, and I think I can define RVec<Electron*> columns (copying the pointers, or maybe even reusing the underlying array) with a helper function in the framework or analysis code, so that should be fine as a solution. I’m also checking with my colleagues because it’s not clear yet if we will use the Delphes trees directly or a reduced flat tree format, so it’s not very urgent; I’ll open an issue in any case.
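
A rough sketch of the kind of helper I have in mind, written with stand-in types so that it compiles on its own. Obj, Lepton, ClonesLike and asTypedPointers are hypothetical names, not ROOT or Delphes code, and std::vector takes the place of RVec here.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for TObject and for a Delphes object class.
struct Obj {
  virtual ~Obj() = default;
};
struct Lepton : Obj {
  float PT = 0.f;
};

// Hypothetical stand-in for a type-erased array that owns its elements and
// only exposes them as base-class pointers.
struct ClonesLike {
  std::vector<std::unique_ptr<Obj>> slots;
  std::size_t size() const { return slots.size(); }
  Obj* at(std::size_t i) const {
    assert(i < slots.size());
    return slots[i].get();
  }
};

// Copies the element pointers (not the elements) into a typed vector.
// The caller must make sure that every element really is a T, and the
// returned pointers are only valid while the source array is alive.
template <typename T>
std::vector<T*> asTypedPointers(const ClonesLike& arr) {
  std::vector<T*> out;
  out.reserve(arr.size());
  for (std::size_t i = 0; i < arr.size(); ++i) {
    out.push_back(static_cast<T*>(arr.at(i)));
  }
  return out;
}

// Fills a ClonesLike with Lepton objects for demonstration purposes.
inline ClonesLike makeLeptons(const std::vector<float>& pts) {
  ClonesLike arr;
  for (float pt : pts) {
    auto lep = std::make_unique<Lepton>();
    lep->PT = pt;
    arr.slots.push_back(std::move(lep));
  }
  return arr;
}
```

For the real thing, I assume the loop would run over TClonesArray::GetEntriesFast() and TClonesArray::At(i), fill a ROOT::RVec<Electron*>, and be declared once through gInterpreter.Declare so that a Define can call it. I have not tested that version yet.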
Thanks,
Pieter
