EnableImplicitMT() prevents reading XrootD file with RDataFrame

After calling EnableImplicitMT(), RDataFrame is unable to read an input TTree stored in a file accessed via XrootD:

ROOT.ROOT.EnableImplicitMT()
df1 = ROOT.RDataFrame(t)
h1 = df1.Histo1D('Sc1_M')
h1.Sumw2()

results in:

Error in <TTreeProcessorMT::Process>: An error occurred while getting tree root://eosuser.cern.ch//eos/user/m/mwilkins/LcLc/data/18/Down/slimmed_0_99.root:/SczTree/DecayTree from file root://eosuser.cern.ch//eos/user/m/mwilkins/LcLc/data/18/Down/slimmed_0_99.root: skipping this file.

while both

ROOT.ROOT.DisableImplicitMT()
df2 = ROOT.RDataFrame(t)
h2 = df2.Histo1D('Sc1_M')
h2.Sumw2()

and

ROOT.ROOT.EnableImplicitMT()
t.Draw('Sc1_M>>h3')
h3 = ROOT.gDirectory.Get('h3')
h3.Sumw2()

produce no errors.


ROOT Version: 6.17/01
Platform: macOS
Compiler: Not Provided


Hi,
thank you for the report, this is nasty.
We routinely read remote files in parallel for testing and benchmarking, so there must be something specific to your setup that breaks multi-threaded reads. Would you be able to share a minimal reproducer and data with us?

Cheers,
Enrico

Sure.

Reproducer:

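import ROOT

# Open the file via XrootD and fetch the tree as a TTree object
f = ROOT.TFile.Open('root://some/path/ForSharing.root')
t = f.Get('DecayTree')

# Case 1: RDataFrame built from the TTree object, implicit MT enabled (fails)
ROOT.ROOT.EnableImplicitMT()
df1 = ROOT.RDataFrame(t)
h1 = df1.Histo1D('Sc1_M')
h1.Sumw2()

# Case 2: same RDataFrame construction, implicit MT disabled (no error)
ROOT.ROOT.DisableImplicitMT()
df2 = ROOT.RDataFrame(t)
h2 = df2.Histo1D('Sc1_M')
h2.Sumw2()

# Case 3: TTree::Draw with implicit MT enabled (no error)
ROOT.ROOT.EnableImplicitMT()
t.Draw('Sc1_M>>h3')
h3 = ROOT.gDirectory.Get('h3')
h3.Sumw2()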

Data:
ForSharing.root (338.4 KB)
This file contains a subset of the events, stored in a TTree cloned from the original.

NB:
I do not observe the behavior using the local copy I’ve attached, only when accessing files via XrootD.

Hi @mwilkins,
thank you for the simple reproducer.
This is now ROOT-9948.

As a temporary workaround, note that if you construct RDataFrame as

ROOT.RDataFrame('DecayTree', 'root://some/path/ForSharing.root')

instead of

f = ROOT.TFile.Open('root://some/path/ForSharing.root')
t = f.Get('DecayTree')
df1 = ROOT.RDataFrame(t)

things work correctly even with multi-threading enabled, as the problem is in the logic we have in place to deduce a TTree name in a TFile from a TTree object.
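
For convenience, the workaround can be wrapped in a small helper. This is only a sketch: the function name make_dataframe and its root and multithreaded parameters are made up for illustration and are not part of ROOT. It relies only on the RDataFrame(treename, filename) constructor shown above and on EnableImplicitMT/DisableImplicitMT.

```python
def make_dataframe(tree_name, file_url, root=None, multithreaded=True):
    """Build an RDataFrame from a tree name and a file URL.

    Passing the name and URL (instead of a TTree object) avoids the
    tree-name deduction step that fails for XrootD files in ROOT-9948.
    `root` is a hypothetical injection point for the ROOT module; it is
    imported lazily when not supplied.
    """
    if root is None:
        import ROOT as root  # requires a working PyROOT installation

    # Switch implicit multi-threading on or off before building the dataframe
    if multithreaded:
        root.ROOT.EnableImplicitMT()
    else:
        root.ROOT.DisableImplicitMT()

    # Name + URL constructor: the workaround described above
    return root.RDataFrame(tree_name, file_url)


# Example usage (placeholder path taken from the reproducer):
# df = make_dataframe('DecayTree', 'root://some/path/ForSharing.root')
# h = df.Histo1D('Sc1_M')
```

The helper never opens the TFile itself, so there is no TTree object whose name would have to be deduced.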

Cheers,
Enrico

You’re welcome. Thank you for your prompt replies.
