RDataFrame accessing TString array branches of a split custom class

Dear RDataFrame developers,

after having solved the problem with the array branches described in this thread:

I found out that this TTree also has branches filled with arrays of TStrings, and I have trouble accessing these branches.
If I try something like

df.Foreach( [](RVecTString& det) { auto x = det[0]; } , { "peaks.detector" } );

I get the runtime error:

Error in <TTreeReaderArrayBase::SetImpl()>: Cannot read branch peaks.detector: unhandled streamer element type TStreamerString

followed by a segmentation fault.

This branch also causes trouble with the older TTree::Scan(), which only ever shows the first element of the array in each event, while TTree::Show() gives the correct output (and is the only independent confirmation that the TTree was filled correctly).

I am attaching again the same reduced ROOT file (5 events only) from the previous thread:
only5events.root (563.3 KB)


ROOT Version: 6.14/02
Platform: Debian 9
Compiler: gcc 6.3.0-18+deb9u1



Hi,
thanks for the file, I’ll get to it as soon as possible and let you know what I find out.

Cheers,
Enrico

Alright, I can reproduce the error with TTreeReader (which is what RDataFrame uses internally for reading), and this looks like a bug on our part. I opened a bug report (ROOT-9813); please follow the progress on this issue there. These days we are working on the release of ROOT v6.16, and this bug might not make it to the top of the to-do list before then, but feel free to ping us on the Jira ticket in case this is a blocker for you and you don’t see any action after the release.

As a temporary workaround, you could read that branch with a TTree directly, and write out another TTree that contains std::vector<std::string> instead of TString[] arrays. Then you can register this new tree as a friend of the original one (via TTree::AddFriend) to read them together with RDataFrame.
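
The conversion step of this workaround could be sketched roughly as follows. This is only a minimal sketch, not ROOT API: ToStdStrings is a hypothetical helper name, and it is written as a template so that it does not depend on ROOT headers. The only assumption is that each array element has a Data() method returning a C string, as TString::Data() does.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of ROOT): copy the first n elements of an
// array of TString-like objects into a std::vector<std::string>.
// Assumption: each element provides Data() returning a null-terminated
// const char*, as TString::Data() does.
template <typename StringLike>
std::vector<std::string> ToStdStrings(const StringLike* first, std::size_t n) {
    std::vector<std::string> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        out.emplace_back(first[i].Data());
    }
    return out;
}
```

In a loop over the original tree, the returned vector would be assigned once per entry to a std::vector<std::string> that was registered on the new tree with TTree::Branch, before calling TTree::Fill on the new tree.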

Cheers,
Enrico

Thanks a lot, especially for proposing a temporary workaround.

If I get it right:
I have to use the old way of accessing the TTree with SetBranchAddress() etc. (I will look back at some old code of mine), while maybe I can use the new RDataFrame::Snapshot() to produce the new ROOT file.

However my data sample is in several root files (more than 300) and for the moment I “chain together” only 10 of them to build up my analysis.
So for each of these 10 root files I have to produce the new root file separately, and only when I have them all, I can follow the guidelines at https://root.cern.ch/doc/v614/classROOT_1_1RDataFrame.html#friend_trees (note I am using v6.14.02). The guidelines do not mention a TChain, but in principle it should work as well (since TChain is also a TTree).

I think I can live easily with this workaround during this phase, and maybe there will be a fix later.

Thanks again,
Matteo

Everything is correct, except that you will not be able to use Snapshot to write out the friend tree with the vector<string>, because RDF does not yet understand the type of the input branch (an array of TStrings).
