Inconsistent behaviour of RDataFrame ColumnTypes

Dear experts,

We are observing inconsistent behaviour between reading an RDataFrame column from a TTree and redefining it. For instance, if a TBranch is declared as v[3]/F, the column type reported when reading the TTree is ROOT::VecOps::RVec<Float_t>, but after redefining the column it becomes ROOT::VecOps::RVec<float>. A full reproducer is posted below.

This is a problem when calling Vary on multiple columns at once, if some of them come directly from the TTree while others are the result of redefinitions.

A crude workaround is to redefine every input column to itself, which is not ideal and, I expect, also adds some overhead.

Any help would be much appreciated.

Many thanks,
Federico

#include <ROOT/RDataFrame.hxx>
#include <ROOT/RVec.hxx>
#include <TFile.h>
#include <TTree.h>

#include <iostream>  // std::cout
#include <typeinfo>  // typeid

int create_root_file()
{
        TFile* f = TFile::Open("tree.root", "RECREATE");
        TTree *t = new TTree("tree", "tree");
        Float_t v[3] = {0., 1., 2.};
        t->Branch("v", &v, "v[3]/F");
        t->Branch("w", &v, "w[3]/F");
        t->Fill();
        t->Write();
        f->Close();
        return 0;
}


int main()
{
        create_root_file();

        ROOT::RDataFrame df_start("tree", "tree.root");
        auto df = df_start.Filter("true");
        auto t1 = df.GetColumnType("v");
        df = df.Redefine("v", "v");
        auto t2 = df.GetColumnType("v");
        std::cout 
                << t1 << " from TFile\n"
                << t2 << "   from Redefine(...)\n"
                << typeid(float).name() << "\n"
                << typeid(Float_t).name() << "\n"
                << typeid(ROOT::VecOps::RVec<float>).name() << "\n"
                << typeid(ROOT::VecOps::RVec<Float_t>).name() << "\n"
                << ROOT::RDF::RDFInternal::TypeID2TypeName(typeid(ROOT::VecOps::RVec<float>)) << "\n"
                << ROOT::RDF::RDFInternal::TypeID2TypeName(typeid(ROOT::VecOps::RVec<Float_t>)) << "\n"
        ;
        // uncomment the line below to get the error
        // df = df.Vary({"v", "w"}, [] (const ROOT::RVecF& v, const ROOT::RVecF& w) { return ROOT::RVec<ROOT::RVec<ROOT::RVecF>>{{v * 0.9, v * 1.1}, {w * 0.9, w * 1.1}}; }, {"v", "w"}, {"down", "up"}, "variation");
        return 0;
}

Details of the ROOT version used:

   ------------------------------------------------------------------
  | Welcome to ROOT 6.32.06                        https://root.cern |
  | (c) 1995-2024, The ROOT Team; conception: R. Brun, F. Rademakers |
  | Built for linuxx8664gcc on Oct 01 2024, 10:44:42                 |
  | From tags/6-32-06@6-32-06                                        |
  | With g++ (Alpine 13.2.1_git20240309) 13.2.1 20240309             |
  | Try '.help'/'.?', '.demo', '.license', '.credits', '.quit'/'.q'  |
   ------------------------------------------------------------------

Welcome to the ROOT Forum!
I’ll let @vpadulan comment on this

Dear @ferri,

Thank you for the reproducer, I will take a look soon.

Cheers,
Vincenzo
