
Really solved: RDataFrame for TChain loaded with TTrees with different names?

Hi ROOT-Team,

I wonder whether the mentioned issue is really completely solved or whether I am being hit by something different:

Initially, I worked with ROOT v6.20.02 and experienced the same bug as described here:

I updated to ROOT v6.20.08 (where the bug should be solved) and, if I remember correctly, a check involving only a small number of events looked promising…

But now the problem is as follows:

This will get a little complicated, and I apologize for not providing a minimal working example:

I have two TTrees with different names, but in the same TFile. I create an ad-hoc TChain and build an RDataFrame on top of it (pretty much as described in the initial bug report). These trees contain different event classes and are labeled with _5 or _45.

My analysis bins the data (here in costheta); I achieve this with corresponding RDataFrame::Filter calls defined in a loop. These binned samples are then extracted (RDataFrame::Take) and processed further. The bug can be seen when comparing the sizes of the output vectors, plotted on the Y-axis, with the aforementioned bins on the X-axis:

In every bin, multi_chained does not match single_chained (which is the same chain run single-threaded).

single_chained seems to be correct, as it reproduces a manual sum of the _45 and _5 event classes (which were obtained by the same macro, but with a single tree instead of a chain backing the dataframe). For the individual tree runs, multi-threading does not affect the results (no difference between single_45 and multi_45 or single_5 and multi_5).

I also get an error message on the terminal, which appears several times during the running DataFrame analysis. This happens only in multithreaded chain mode and thus seems to be related to the problem:

Error in TTreeReader::SetEntriesRange(): first entry out of range 0…3375163

The easiest solution for me might be to work around the problem and simply merge both trees beforehand (via TChain->CloneTree()), but this still looks like a bug…

Regards,
Philipp

Hi Philipp,
this indeed looks like a bug, sorry about that! The error messages are the clearest red flag.

If I understand correctly, it should be reproducible when reading TChains containing different trees with different names but coming from the same file, having called EnableImplicitMT.

However, this simple tentative reproducer of mine seems to work fine:

#include <ROOT/RDataFrame.hxx>
#include <TChain.h>
#include <iostream>

int main() {
  // write two trees t1 and t2 to the same file
  {
    ROOT::RDataFrame(10)
        .Define("x", [] { return 42; })
        .Snapshot<int>("t1", "f.root", {"x"});
    ROOT::RDF::RSnapshotOptions opts;
    opts.fMode = "update";
    ROOT::RDataFrame(10)
        .Define("x", [] { return 42; })
        .Snapshot<int>("t2", "f.root", {"x"}, opts);
  }

  // single-thread processing of the TChain
  {
    TChain c;
    c.Add("f.root/t1");
    c.Add("f.root/t2");
    ROOT::RDataFrame df(c);
    auto count = df.Count();
    auto sum = df.Sum<int>("x");
    std::cout << "should be 20: " << *count << std::endl;
    std::cout << "should be 840: " << *sum << std::endl;
  }

  // multi-thread processing of the TChain
  {
    ROOT::EnableImplicitMT();
    TChain c;
    c.Add("f.root/t1");
    c.Add("f.root/t2");
    ROOT::RDataFrame df(c);
    auto count = df.Count();
    auto sum = df.Sum<int>("x");
    std::cout << "should be 20: " << *count << std::endl;
    std::cout << "should be 840: " << *sum << std::endl;
  }

  return 0;
}

What’s missing from my repro?

Cheers,
Enrico

I think I have to give up on this. The problem is clearly present in multi-threading mode when I am chaining my main data tree as described previously. Already a *(RDataFrame(my_chain).Count()) delivers a wrong result, i.e. the complicated binning structure is not needed to trigger the problem.

However, I could not create a simple script (expanding from your template) that would reproduce the error. Unfortunately, the problematic data tree contains more or less my entire analysis (involving custom object types, multiple stages of merging the data, etc.). I tried several times, but I was not able to create a “mock tree” that would expose the same behavior.

I am also uncertain if the “different treename, identical filename” aspect is of relevance here. But as I was initially hit by the referenced bug, I considered them to be somehow related.

As expected, the error disappears if I merge the data of the chain into a single tree, which is what I will use for now…

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.

Hi,
sorry for the late reply, I took some time off work :)

I’m glad you have a viable workaround. I would be happy to debug this on the original data and the simplest piece of code that reproduces the problem (in case it’s ok to share the data privately).
Otherwise, best of luck and feel free to open a new topic should you encounter further issues.

Cheers,
Enrico
