Crash when reading too many branches

For what it’s worth, there seems to be also a dependence on how “similar” the branch names are. For example, if I try reading only two branches with names:

HLT_Mu7_IP4_part0
HLT_Mu7_IP4_part1

I get the crash. While trying with the following branches:

HLT_Mu7_IP4_part0
HLT_Mu8_IP3_part0
HLT_Mu8_IP5_part0
HLT_Mu8_IP6_part0
HLT_Mu9_IP4_part0
HLT_Mu9_IP5_part0
HLT_Mu9_IP6_part0
HLT_Mu12_IP6_part0
HLT_Ele32_WPTight_Gsf

I don’t get the crash.

In the original post, the branches I read are:

nElectron
Electron_pt
Electron_eta
Electron_deltaEtaSC
Electron_phi
Electron_charge
Electron_dxy
Electron_dz

(same set for Muon_* without the deltaEtaSC)

[
genWeight
LHEWeight_originalXWGTUP
]

The last two branches within [] are the ones I added to trigger the crash. For now I do not have any explanation for this strange behavior.

Keeping the read branch names “clearly different”, I can go up to reading 25 branches without crashes. However, I don’t know the criteria for “clearly different”, since there have been some names I thought are “clearly different” yet it still crashed…

In the code, I don’t perform any string manipulation except duplicate checking before storing them.

If you play with strings and if you use “Form(...)” anywhere, replace it with “TString::Format(...)”.

No, I do not have any instances of Form() or Format() in the code.

Or TString for that matter, I use the std::string class and call c_str() for ROOT methods.

The best is to run the failing example under valgrind --suppressios=$ROOTSYS/etc/valgrind-root.supp

Turns out the issue has nothing to do with branch names and counts, but instead has to do with me capturing the TBranch in uptrs.

I wonder why the uptr ran fine with low branch counts or dissimilar names then.

It looks like you are using a TChain. The TBranch object are deleted (and created slightly differently) at each file boundary. To capture a TBranch address you need to use:

  chain->SetBranchAddress( branchname, &objectptr, &branchptr);

and branchptr will be updated with the address of the TBranch object at each file boundary.

This I didn’t know about. Does this mean I need to call SetBranchAddress at every file change or is this automatically updated?

Somewhat related, I perform GetEntry calls from the TBranch pointers (at which point the code no longer knows about the source TChain/Tree), is this in any way affected by the above?

And just to confirm, the above capture will also work with TTree, correct?

This I didn’t know about. Does this mean I need to call SetBranchAddress at every file change or is this automatically updated?

The pointer value is automatically updated.

And just to confirm, the above capture will also work with TTree , correct?

yes :slight_smile:

Somewhat related, I perform GetEntry calls from the TBranch pointers (at which point the code no longer knows about the source TChain / Tree ), is this in any way affected by the above?

TBranch::GetEntry require the entry number within the current TTree. To work correctly you need to pass the return value of LoadTree.

for(Long64_t overallEntry = 0; overallEntry < chainOrTree->GetEntriesFast(); ++overallEntry) {
   Long64_t localEntry = chainOrTree->LoadTree(overallEntry); // for a TTree this is a nop and returns overallEntry.
   branch->GetEntry(localEntry);

Cheers,
Philippe.

Ok, let me see if I understood correctly: In both, tree can be either TTree or TChain

Current way:

branch = tree->GetBranch("name");
branch->SetStatus(1);
branch->SetAutoDelete(false);
branch->SetAddress(&obj);

[... much later ...]
branch->GetEntry(n);

is not correct in that it’ll give me problems when the file is changed. Instead I should do:

tree->SetBranchStatus("name", 1);
tree->SetBranchAddress("name", &obj, &branch);
branch->SetAutoDelete(false);

[... much later ...]
branch->GetEntry(n);

Is this correct?

edit: seeing @pcanal’s reply I see there’s also a problem with how I handle the n. This may be an issue, but ok fine I’ll just have the method see the source TChain too.

Confirming that when adding the second file, only the proper method we discussed here does not crash.

Marking as solved, thanks for the assistance!

I found out that there is a related problem. Much later in the code, I read the branches starting from single branches, some of which are counters. Since I read array branches into vectors, I also check that the read counter does not exceed the vector capacity, and if it does, resize the vector as necessary. Of course this means I need to reset the address, since vector.data() has now changed.

However in this case I ran into the same issue in that the branch->SetAddress() is not fully compatible with TChain. I tried updating it to the tree->SetBranchAddress("name", &obj, &branch) syntax, where tree = counter_branch->GetTree() but this does not fix the issue, presumably because GetTree() gives the current tree rather than the original TChain.

Is this understanding correct, and is there a way for me to get the original TChain so I can update its address table, without explicitly introducing the TChain pointer in the block where GetEntry() and vector layout happen?

presumably because GetTree() gives the current tree rather than the original TChain.

Correct.

without explicitly introducing the TChain pointer in the block where GetEntry() and vector layout happen?

I am afraid not, however …

What kind of branches do you have (i.e. can you delegate the resizing to ROOT)?

For a branch that hold a variable you can call GetMaximum on the BranchCount branch (or LeafCount branch) to get the maximum size of the collection. So the test and resizing needs to be done only once per file.

You can either be notified (by attaching an object with a Notify function to the chain via SetNotify) when a file change or you can detect the switch over (check GetTreeNumber() value after a call to LoadTree).

Cheers,
Philippe.

They are floats, ints - plain old data in the informal sense - and arrays of those.

It’s possible to delegate the resizing, but this probably requires breaking the current structure of the code, which I would rather not do.

That said, if I can really avoid running the test every event, this may be a good optimization opportunity. However it’s not obvious how to do this yet, since it seems that I still need to check the result of GetTreeNumber() every event, which in the case of no resizing, is exactly the same cost of one if statement per event.

As for Notify(), I don’t have a good feel of how it works so I can’t say what the cost will be.

class Notifier : public TObject {
private: 
    TChain *fChain;
    .... *fThingForResizing;
    Notifier(TChain *chain, ... ) : fChain(chain), fThingForResizing(....) {}
    void Notify() {
           fThingForResizing->ResizeStuff(fChain); // or whatever code to do the resizing
    }
};

and then

Notifier notifier( chain, .... ); // This needs to have the same lifetime as the TChain object.
....
chain->SetNotify(&notifier);

And at each file start, Notifier::Notify will be called.

Hmm. Do I understand correctly that beyond the fact that the Notifier class needs to inherit from TObject, and that it needs to have the void Notify() signature member method, the layout is arbitrary and that together with the chain->SetNotifier(&notifier) call, this allows me to execute arbitrary code when, and only when, each time the file in the chain changes?

That’s… cool. I’ll test it out.

Very interestingly, the Notify() approach is slower by far:

Processing 9945000 events...

^C
real	20m17.473s
user	19m54.261s
sys	0m18.491s

vs manually resizing in each event, when the counter exceeds the current capacity.

Processing 9945000 events...
Processed 9945000 events!
756406 pass trigger
160573 also contain emu pair

real	2m35.967s
user	2m33.073s
sys	0m0.930s

Perhaps due to the cost of TTree::GetMaximum()? Either way, premature optimization is probably not a good idea, so I’ll settle with the current performance and fill up the rest of the framework, now that the core issue of the thread is solved.

Thanks again for the assistance.

Yes :slight_smile: … this reads/scan the whole TTree.

For a give branch that correspond to array, you can call GetBranchCount() and then get its leaf and call GetMaximum on the leaf.
For example, this might work for you:

TLeaf *leafcount = fChain->GetLeaf(name_of_index);
// and use leafcount->GetMaximum();

Cheers,
Philippe.

Aha now it’s a lot better. With the 10M events example the runtime of the TLeaf method is roughly equal to the original naive implementation.

I’ll probably be sticking with this as it allows more flexibility, and FWIW also somewhat more modular.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.