RDataFrame built from string branches


I am loading a TTree whose branches are all C-style strings ("/C") into an RDataFrame.
I noticed that the column type is actually 'ROOT::VecOps::RVec<Char_t>'.

This is probably the cause of several issues I see:

  • Filters like df.Filter('var == "blabla"') fail
  • df.Display("var", 10).Print() works in Python but appends symbols like "$??*" at the end
  • df.Display("var", 10)->Print() in C++ complains about missing shared libraries

Could the issue be due to the branch type ("/C" instead of "std::string")? Or do I really need to convert the 'ROOT::VecOps::RVec<Char_t>' objects to strings before any action? Is this behaviour expected?

Thank you,

ROOT Version: 6.24.06
Platform: Centos7

I guess @eguiraud or @etejedor can help

Hi @sfrances ,

Yes to all of the above :) In general, RDF represents C-style arrays as RVecs (including C-style arrays of characters). If the branch type were std::string, RDF would read it as a std::string.

The simplest workaround in v6.26 is df.Redefine("var", "std::string(var.data(), var.size())"). In v6.24, where Redefine is not available, use an equivalent Define under a new column name.
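For readers who want to see what that expression does outside of RDataFrame, here is a minimal standalone sketch. It uses std::vector<char> as a stand-in for ROOT::VecOps::RVec<Char_t> (both expose data() and size()), so it compiles without ROOT. The names CharBuffer, buffer_to_string and buffer_to_string_trimmed are hypothetical, and the trimmed variant is an assumption that is not part of the answer above.

```cpp
#include <string>
#include <vector>

// Stand-in for ROOT::VecOps::RVec<Char_t>: a sized character buffer.
// (Hypothetical alias, for illustration only.)
using CharBuffer = std::vector<char>;

// Same expression as in the workaround: std::string(var.data(), var.size()).
// The explicit length means the result does not rely on a terminating '\0'.
std::string buffer_to_string(const CharBuffer &var)
{
   return std::string(var.data(), var.size());
}

// Variant (an assumption, not from the thread): cut the string at the first
// '\0' in case the buffer carries a terminator or padding after the text.
std::string buffer_to_string_trimmed(const CharBuffer &var)
{
   std::string s(var.data(), var.size());
   const auto pos = s.find('\0');
   if (pos != std::string::npos)
      s.erase(pos);
   return s;
}
```

Once the column holds a std::string, a comparison such as var == "blabla" is an ordinary string comparison.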

I hope this helps!

Thank you so much for the clarification. I assumed at first that I was doing something illegal, so I asked. Perfectly understood!

