I am attempting to use RDataFrame for offline post-processing, i.e. flattening multi-dimensional arrays into 1D branches. For example:
// reading the 2D array as if the 2D array is unrolled into 1D in TTree, n_x+n_y*i
auto df2 = df1.Define( "ntrkhits_Uplane" , "ntrkhits_pandoraTrack[ntracks_pandoraTrack+3*0]" )
.Define( "ntrkhits_Vplane" , "ntrkhits_pandoraTrack[ntracks_pandoraTrack+3*1]" )
.Define( "ntrkhits_WZplane" , "ntrkhits_pandoraTrack[ntracks_pandoraTrack+3*2]" );
However, I get the following error:
terminate called after throwing an instance of 'std::runtime_error'
what(): TTree leaf ntrkhits_pandoraTrack has both a leaf count and a static length. This is not supported.
What is the correct way, or a better way, to read a 2D array in RDataFrame?
Hi @SiewYan ,
RDataFrame does not support 2D arrays well, because TTreeReader, which RDF uses internally, does not support 2D arrays well. The error you get is basically TTreeReader saying “I don’t know how to read this branch”.
But depending on the exact situation there might be a workaround – could you please share this data (even just a couple of events) with me so I can experiment a bit?
Hello @eguiraud , thank you for your answer. Please find the data at the link [*] for your study. It would be great if there were a workaround for reading and manipulating 2D arrays in RDataFrame.
Hi @SiewYan ,
sorry for the high latency, I can reproduce the problem but I could not find a workaround during my first investigation. I’ll give it another go asap.
~/S/w/forum_treereader_2dim_arrays root -l repro.C
root [0]
Processing repro.C...
Error in <TTreeReaderValueBase::CreateProxy()>: The branch ntracks_pandoraTrack contains data of type short. It cannot be accessed by a TTreeReaderValue<int>
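The array-size branch ntracks_pandoraTrack is stored as a short, and TTreeReader cannot access it as the int it expects. So the only workaround I can propose would be to regenerate the file with a different size type (int or unsigned int) or to pre-process that file, without RDF, to do the conversion.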
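To make the pre-processing idea concrete, here is a minimal sketch of the per-event conversion such a step might perform. It is plain C++ with no ROOT calls: reading the original TTree and writing the new one are left out. The names FlatHits and flattenEvent are hypothetical, and the sketch assumes the array is laid out as [ntracks][3] in row-major order with short elements, which I have not verified against the actual file.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Assumption: three wire planes (U, V, W/Z), as the three Define calls suggest.
constexpr std::size_t kNPlanes = 3;

// Hypothetical per-event output: an int counter plus one 1D array per plane.
struct FlatHits {
    int ntracks = 0;
    std::vector<int> uplane;
    std::vector<int> vplane;
    std::vector<int> wzplane;
};

// Assumed layout: hits[ntracks][3] stored row-major, so the flat index of
// (track, plane) is track * 3 + plane. The short counter is widened to int.
FlatHits flattenEvent(short ntracks, const std::vector<short>& hits) {
    if (ntracks < 0) {
        throw std::invalid_argument("negative track count");
    }
    const std::size_t n = static_cast<std::size_t>(ntracks);
    if (hits.size() < n * kNPlanes) {
        throw std::invalid_argument("hit buffer shorter than ntracks * 3");
    }

    FlatHits out;
    out.ntracks = ntracks;
    out.uplane.reserve(n);
    out.vplane.reserve(n);
    out.wzplane.reserve(n);
    for (std::size_t t = 0; t < n; ++t) {
        out.uplane.push_back(hits[t * kNPlanes + 0]);
        out.vplane.push_back(hits[t * kNPlanes + 1]);
        out.wzplane.push_back(hits[t * kNPlanes + 2]);
    }
    // Post-condition: one entry per track in each plane array.
    assert(out.uplane.size() == n);
    return out;
}
```

The resulting per-plane vectors and the int counter could then be written to a new tree as ordinary variable-length 1D branches, which is the shape the original post was trying to obtain.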
I realize that this is frustrating. You mentioned this file is generated from GEANT4, and we should be able to read files generated by GEANT. I’ll check how we can tackle this. FYI @Axel @pcanal .
@eguiraud thank you very much for taking the time to troubleshoot. It looks like this is the bottleneck for using RDF in my work; however, changing the type to int or unsigned int is not so trivial (I reckon). I will need to check with my colleague about regenerating the ROOT file as you suggest.
On the other hand, it would be great if support for this were included in a future release.
I am open to other suggestions if there is a mini-hack to get around it, though.