
Error in <TCollectionLessSTLReader::GetCP()>: Read error in TBranchProxy

I have the following code example, which produces tons of errors like this one:
Error in <TCollectionLessSTLReader::GetCP()>: Read error in TBranchProxy
They start to pop up at some seemingly random event in the middle of the run…

What could be the reason?

import ROOT
ROOT.EnableImplicitMT()

ch = ROOT.TChain("lumical")

ch.Add("/nfs/dust/ilc/user/dudarboh/final_files/FCAL/tb16/e_FTFP_BERT.root")
ch.AddFriend("emv=lumical", "/nfs/dust/ilc/user/dudarboh/final_files/FCAL/tb16/e_FTFP_BERT_EMV.root")
ch.AddFriend("emx=lumical", "/nfs/dust/ilc/user/dudarboh/final_files/FCAL/tb16/e_FTFP_BERT_EMX.root")
ch.AddFriend("emy=lumical", "/nfs/dust/ilc/user/dudarboh/final_files/FCAL/tb16/e_FTFP_BERT_EMY.root")
ch.AddFriend("emz=lumical", "/nfs/dust/ilc/user/dudarboh/final_files/FCAL/tb16/e_FTFP_BERT_EMZ.root")

df = ROOT.RDataFrame(ch).Filter(" rdfentry_ < 9340000").Filter("if (rdfentry_ % 500000 == 0){cout<<rdfentry_<<endl;} return true;")

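# "default" is the main tree; the remaining titles match the friend aliases above
titles = ["default", "emv", "emx", "emy", "emz"]
histos = [ROOT.TH1D(), ROOT.TH1D(), ROOT.TH1D(), ROOT.TH1D(), ROOT.TH1D()]
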
df = df.Define("y1_default", "mc_cont_posy[layer == 0 && mc_cont_momz < 0.]")
histos[0] = df.Histo1D(("h_default", "default", 32, -90, 35), "y1_default")
for i, t in enumerate(titles[1:]):
    df = df.Define("y1_{}".format(t), "{0}.mc_cont_posy[{0}.layer == 0 && {0}.mc_cont_momz < 0.]".format(t) )
    histos[i+1] = df.Histo1D(("h_{}".format(t), "{}".format(t), 32, -90, 35), "y1_{}".format(t) )

Hi,
there might be some issue with these particular files, can you share one that reproduces the issue? @pcanal do you have any idea what might cause that error?

Cheers,
Enrico

I discovered that it runs more smoothly with fewer threads, e.g.
ROOT.EnableImplicitMT(5) gives an error around event 2 million
ROOT.EnableImplicitMT(4) still gives an error, but around event 4 million
ROOT.EnableImplicitMT(3) still gives an error, but around event 8 million
ROOT.EnableImplicitMT(2) still gives an error, but around event 9 million
# ROOT.EnableImplicitMT() (commented out, i.e. no multithreading) makes it to the end

and it feels like the processing gets slower as the event number grows…

Could it be something to do with file access by different threads?

It certainly looks related. Can you please open an issue at https://github.com/root-project/root/issues providing some way for us to reproduce the problem?

Cheers,
Enrico

Also, the error does not happen in this variation of the code, which does not use friends:

import ROOT
ROOT.EnableImplicitMT()

df = ROOT.RDataFrame("lumical", "/nfs/dust/ilc/user/dudarboh/final_files/FCAL/tb16/e_FTFP_BERT.root").Filter(" rdfentry_ < 9340000").Filter("if (rdfentry_ % 500000 == 0){cout<<rdfentry_<<endl;} return true;")
df_emv = ROOT.RDataFrame("lumical", "/nfs/dust/ilc/user/dudarboh/final_files/FCAL/tb16/e_FTFP_BERT_EMV.root").Filter(" rdfentry_ < 9340000").Filter("if (rdfentry_ % 500000 == 0){cout<<rdfentry_<<endl;} return true;")
df_emx = ROOT.RDataFrame("lumical", "/nfs/dust/ilc/user/dudarboh/final_files/FCAL/tb16/e_FTFP_BERT_EMX.root").Filter(" rdfentry_ < 9340000").Filter("if (rdfentry_ % 500000 == 0){cout<<rdfentry_<<endl;} return true;")
df_emy = ROOT.RDataFrame("lumical", "/nfs/dust/ilc/user/dudarboh/final_files/FCAL/tb16/e_FTFP_BERT_EMY.root").Filter(" rdfentry_ < 9340000").Filter("if (rdfentry_ % 500000 == 0){cout<<rdfentry_<<endl;} return true;")
df_emz = ROOT.RDataFrame("lumical", "/nfs/dust/ilc/user/dudarboh/final_files/FCAL/tb16/e_FTFP_BERT_EMZ.root").Filter(" rdfentry_ < 9340000").Filter("if (rdfentry_ % 500000 == 0){cout<<rdfentry_<<endl;} return true;")

data = [df, df_emv, df_emx, df_emy, df_emz]
titles = ["default", "emv", "emx", "emy", "emz"]
scales = [0., 0., 0., 0., 0.]
histos = [ROOT.TH1D(), ROOT.TH1D(), ROOT.TH1D(), ROOT.TH1D(), ROOT.TH1D()]
colors = [ROOT.kBlack, ROOT.kRed-8, ROOT.kRed+2, ROOT.kGreen+3, ROOT.kBlue]
canvas = ROOT.TCanvas()

# for i, d in enumerate(data):
    # print("n events in", titles[i], d.Count().GetValue() )

for i, (d, t) in enumerate( zip(data, titles) ):
    scales[i] = d.Count()
    d = d.Define("pad1", "mc_cont_posy[layer == 0 && mc_cont_momz < 0.]")
    histos[i] = d.Histo1D(("h_{}".format(t), "{}".format(t), 200, -90, 35),"pad1")

I will try to make an independent reproducer and open an issue.

@eguiraud I am having some trouble…

When I try to create TTrees with RDataFrame and add them as friends to the chain, I encounter:

Error in <AddFriend>: Tree 'test1' has the kEntriesReshuffled bit set, and cannot be used as friend nor can be added as a friend unless the main tree has a TTreeIndex on the friend tree 'test2'. You can also unset the bit manually if you know what you are doing.

I tried something like:

import ROOT
ROOT.EnableImplicitMT()

ROOT.RDataFrame(10000000).Define("x", "gRandom->Rndm()").Snapshot("test1", "test1.root")
ROOT.RDataFrame(10000000).Define("x", "gRandom->Rndm()").Snapshot("test2", "test2.root")

f1 = ROOT.TFile("test1.root")
f2 = ROOT.TFile("test2.root")

t1 = f1.Get("test1")
t2 = f2.Get("test2")

t1.ResetBit(ROOT.TTree.EStatusBits.kEntriesReshuffled)
t2.ResetBit(ROOT.TTree.EStatusBits.kEntriesReshuffled)

ch = ROOT.TChain("test1")
ch.Add("test1.root")
ch.AddFriend("fr=test2", "test2.root")

df = ROOT.RDataFrame(ch).Filter(" rdfentry_ < 999999").Filter("if (rdfentry_ % 500000 == 0){cout<<rdfentry_<<endl;} return true;")

df = df.Define("y", "x*fr.x")
h = df.Histo1D(("h", "", 100, -10, 10), "y")
h.Draw()

but it fails… Any advice on how to compactly add friend trees produced by RDataFrame?

I tried to look at this, but I find it hard to apply to my case.

I think you’ll have to do this without EnableImplicitMT :confused:

To save time you can also Snapshot just one file and then cp it.
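
A minimal sketch of these two suggestions, under some assumptions: the helper names (`friend_spec`, `build_friend_files`, `build_chain`) and the small default entry count are made up for illustration, and the expectation that a single-threaded Snapshot leaves kEntriesReshuffled unset follows from the advice above rather than from a test on these files. Note that a copied file still contains a tree named "test1", so the friend has to be looked up under that name inside test2.root.

```python
import shutil


def friend_spec(alias, tree_name):
    """Build the 'alias=treename' string passed to AddFriend."""
    return "{}={}".format(alias, tree_name)


def build_friend_files(n_entries=1000, main_path="test1.root", friend_path="test2.root"):
    """Write one file single-threaded, then copy it to obtain the friend file."""
    import ROOT  # imported lazily so friend_spec can be used without ROOT

    # Deliberately no ROOT.EnableImplicitMT() before this Snapshot:
    # single-threaded writing keeps the entry order, so the tree should
    # not be flagged with kEntriesReshuffled.
    ROOT.RDataFrame(n_entries).Define("x", "gRandom->Rndm()").Snapshot("test1", main_path)

    # The copy is byte-identical, so it also holds a tree called "test1".
    shutil.copy(main_path, friend_path)


def build_chain(main_path="test1.root", friend_path="test2.root"):
    """Chain the main file and attach the copied file as friend 'fr'."""
    import ROOT

    ch = ROOT.TChain("test1")
    ch.Add(main_path)
    ch.AddFriend(friend_spec("fr", "test1"), friend_path)
    return ch
```

With the files written this way, `build_chain()` can be handed to `ROOT.RDataFrame` and the friend column is addressed as `fr.x`, as in the snippet above.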

I have tried to make a reproducer, but I failed…

Maybe the code above fails because the files themselves are quite large… 34-47 GB… So far I have only tried the example with files of 3 GB at most… Maybe I am missing something, maybe not… But I think this one is quite hard to catch…

Is there any way I can check what ROOT does internally while executing my code? Maybe this would help to track it down.

You can try setting ROOT.gDebug to a high-enough value but this is a problem in TTree/TTreeReader internals, we need either a genius insight by @pcanal or a way to reproduce and investigate on our side. Are the files private? If yes, could you maybe be allowed to duplicate just 1% of that data N times such that the dataset size is equal to the original (but the physics content is all redundant), so that the data can then be shared with us? (or maybe the data could simply be shared with me privately under the agreement that I don’t discover any new physics with it? :grinning_face_with_smiling_eyes: )

Cheers,
Enrico
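
The "duplicate 1% of the data N times" idea could be sketched roughly as follows. This is only an illustration under assumptions: the helper names (`copies_needed`, `copy_paths`, `write_sample`, `make_copies`) and the output file naming are hypothetical, and nothing here has been run against the original files.

```python
import shutil


def copies_needed(total_entries, sample_entries):
    """How many copies of the sample are needed to reach total_entries (rounded up)."""
    if sample_entries <= 0:
        raise ValueError("sample_entries must be positive")
    return -(-total_entries // sample_entries)  # integer ceiling division


def copy_paths(stem, n_copies):
    """Hypothetical naming scheme for the duplicated files."""
    return ["{}_{}.root".format(stem, i) for i in range(n_copies)]


def write_sample(tree_name, in_path, out_path, sample_entries):
    """Snapshot the first sample_entries entries of the tree into a new file."""
    import ROOT  # imported lazily so the pure helpers work without ROOT

    # Range is only available without implicit multithreading,
    # so ROOT.EnableImplicitMT() must not be called before this.
    ROOT.RDataFrame(tree_name, in_path).Range(0, sample_entries).Snapshot(tree_name, out_path)


def make_copies(sample_path, stem, n_copies):
    """Duplicate the sample file on disk and return the list of copies."""
    paths = copy_paths(stem, n_copies)
    for path in paths:
        shutil.copy(sample_path, path)
    return paths
```

For example, `copies_needed(9340000, 93400)` gives 100, so a 1% sample of the 9.34 million entries used above would be copied 100 times and the copies added to a TChain one by one.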

There is no “privacy” issue with sharing the data, it is more of a “size” issue, so I didn’t know where to put it.

But I recently realized that I can remove all the unused columns except one and the issue still persists.

So now the total size of the files is down to ~8 GB.

I just need ~1 hour to upload them to the cloud, and then I will open an issue on GitHub with links to the files.

cheers,
Bohdan

Moved to the: