Issues with RDataFrame if name and leaflist of a TBranch are different



ROOT Version: 6.20
Platform: MacOS
Compiler: Clang


Dear ROOT developers,

I do not know whether this is known behavior, or even something trivial. While working on some TTrees, I noticed that RDataFrame has problems with branches whose name and leaflist do not coincide, for instance:

    TTree tree ("tree", "tree");
    int y;
    tree.Branch ("x0", &y, "x0/I");             // Good
    tree.Branch ("x1", &y, "z1/I");             // Not good

In particular, importing the TTree as an RDataFrame produces two columns named “x0” and “x1.z1” respectively. This leads to errors when using functions like Snapshot or Histo1D.
I attach a short macro which shows the problem. It creates a TTree and saves it to a TFile. Then it opens that TFile both in the standard ROOT way and with RDF, prints the column names, draws a histogram of the variable x1 and saves the tree again.

#include <iostream>
#include <ROOT/RDataFrame.hxx>
#include <TFile.h>
#include <TTree.h>

void CreateFileAndTree()                // Create a dummy file and TTree
{
    TFile file ("dummy.root", "RECREATE");
    TTree tree ("tree", "tree");
    int y;
    tree.Branch ("x0", &y, "x0/I");             // HERE! x0 == x0 --> OK
    tree.Branch ("x1", &y, "z1/I");             // HERE! x1 != z1 --> NO --> Problems with snapshot

    for (y = 0; y < 100; y++) tree.Fill();

    tree.Write();
    file.Close();
}

void ttree_Macro()
{
    CreateFileAndTree();                                                        // Create a dummy file and TTree

    // --- Standard ROOT
    // Read Tree
    TFile *fileIn   = new TFile ("dummy.root", "READ");
    TFile *fileOut  = new TFile ("dummyROOT.root", "RECREATE");
    TTree *tree;
    fileIn  -> GetObject("tree", tree);  

    // Print info
    tree    -> Print();                                                         // Print info, including column names

    // Draw a histogram
    tree    -> Draw ("x1");                                                     // Draw a histogram - WORKS!


    // Save Tree
    tree    -> Write();
    fileOut -> Close();
    fileIn  -> Close();



    // --- RDataFrame
    // Read Tree
    ROOT::RDataFrame d ("tree", "dummy.root");

    // Print info
    for (auto && elem : d.GetColumnNames()) std::cout << elem << std::endl;     // Print columns names

    // Draw a histogram
    auto h = d.Histo1D("x1");
    h -> Draw();                                                                // Draw a histogram - DOES NOT WORK

    // Save Tree
    d.Snapshot("tree", "dummyRDF.root", d.GetColumnNames());                    // WORKS!
    d.Snapshot("tree", "dummyRDF.root");                                        // DOES NOT WORK!

}
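
To summarise the column naming I observe, here is a toy model in plain C++. It is only a sketch of the reported behavior ("x0" versus "x1.z1"), not ROOT code: the function name `ObservedColumnName` is hypothetical, and the rule it encodes is an assumption based on the two branches above, so it may not hold for other branch layouts.

```cpp
#include <string>

// Toy model (NOT ROOT code) of the column names printed by the macro above.
// Assumption: a single-leaf branch shows up under the branch name when the
// leaf name matches it, and as "<branch>.<leaf>" when the two differ.
std::string ObservedColumnName(const std::string &branchName,
                               const std::string &leafName)
{
    if (branchName == leafName)
        return branchName;               // e.g. "x0" with leaf "x0"
    return branchName + "." + leafName;  // e.g. "x1" with leaf "z1"
}
```

With the two branches from the macro, `ObservedColumnName("x0", "x0")` gives the plain branch name, while `ObservedColumnName("x1", "z1")` gives the dotted name, which would explain why a column called just "x1" is not found.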

I am looking forward to your reply.
Best regards,
Loris

Hi Loris,
thanks for the report and the simple reproducer. This does indeed look like a bug. We have tickets with variations on this theme, but I don’t think we have exactly this one, unless I missed it. In any case, can you please open a Jira ticket about this?

Cheers,
Enrico

This is now ROOT-10625 (thanks @Loris). Discussion continues there.
