I have noticed consistent problems using TChain to process many files containing large N-tuples or large numbers of histograms. The memory used by ROOT appears to increase “without limit” as more and more files get processed, until either the application is killed or my workstation (Linux) crashes.
I know that I can deal with this manually by writing my own loop over the files, opening and closing each one individually and explicitly clearing out memory in between. That, however, seems to fundamentally defeat the purpose of TChain.
I would like to avoid this memory explosion while continuing to reap the benefits of TChain’s ability to process many files. In particular, I want to use the TChain::Process() function as an event-loop framework for data analysis, filtering, and histogram filling and fitting. Can someone give me some guidance on how to do this most effectively?
Are there special Option_t strings I can specify to encourage ROOT to do the necessary memory management at the end of each processed file? Are there options or configuration functions I should call on the TChain itself? Are there settings at the gSystem level I could use for this purpose?
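For concreteness, here is roughly the usage pattern I have in mind (a sketch only; the tree name "ntuple", the file names, and the selector MySelector are placeholders):

```cpp
// Sketch of the intended usage; the tree name, file names, and
// selector are placeholders, not our actual analysis code.
#include "TChain.h"

void runAnalysis()
{
   TChain chain("ntuple");          // name of the TTree inside each file
   chain.Add("run1.root");
   chain.Add("run2.root");
   // ... many more files ...

   // Event loop: ROOT calls MySelector::Process() once per entry,
   // moving from file to file behind the scenes.
   chain.Process("MySelector.C+");
}
```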
TTree::Process does not leak memory (as far as we know) on its own.
However, your data and/or the objects stored in the TTree might.
You might need to do some clean-up at the beginning or end of your Process method. If you provide more information about your specific case, we might be able to spot the issue.
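For example (a sketch only; MySelector, fChain, the branch variables fPx/fPy/fPz/fE, and the histogram fHistMass are hypothetical names), any per-entry allocation needs a matching delete:

```cpp
// Sketch only: where per-entry clean-up belongs in a TSelector-derived class.
// All member names here are hypothetical.
Bool_t MySelector::Process(Long64_t entry)
{
   fChain->GetTree()->GetEntry(entry);   // load this entry's branch data

   // Anything allocated here must be freed before returning; otherwise
   // the memory footprint grows with the number of entries processed.
   TLorentzVector *p4 = new TLorentzVector(fPx, fPy, fPz, fE);
   fHistMass->Fill(p4->M());
   delete p4;

   return kTRUE;
}
```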
[quote=“pcanal”]TTree::Process does not leak memory (as far as we know) on its own. However, your data and/or the objects stored in the TTree might.
You might need to do some clean-up at the beginning or end of your Process method. If you provide more information about your specific case, we might be able to spot the issue.
[/quote]
I’ll work on constructing a simple case. I don’t think the problem is with Process() per se. What we’ve observed is that if we set up a TChain with multiple files, then in any activity where we implicitly loop over the entire contents, the executable slowly but surely occupies more and more memory. If we reduce the number of files referenced in the chain, that growth is reduced proportionately.
In particular, we have found that one definite workaround is to skip the whole TChain system entirely. Instead, we can set up a manual loop over a list of filenames. For each one, we open a TFile, process the TTree in that file, and explicitly close and delete the TFile at the end of each iteration. If we do this, the executable’s memory allocation remains more or less constant.
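In code, the workaround looks roughly like this (a sketch; the tree name "ntuple" and the file list stand in for our actual setup):

```cpp
// Sketch of the manual-loop workaround; "ntuple" and the file list
// are placeholders for our actual tree and files.
#include "TFile.h"
#include "TTree.h"
#include <string>
#include <vector>

void processFiles(const std::vector<std::string> &names)
{
   for (const std::string &name : names) {
      TFile *file = TFile::Open(name.c_str());
      if (!file || file->IsZombie()) continue;

      TTree *tree = nullptr;
      file->GetObject("ntuple", tree);
      if (tree) {
         const Long64_t n = tree->GetEntries();
         for (Long64_t i = 0; i < n; ++i) {
            tree->GetEntry(i);
            // ... apply cuts, fill histograms ...
         }
      }

      file->Close();   // releases the tree and everything the file owns
      delete file;     // memory stays flat from one file to the next
   }
}
```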
I guess what I’m asking is whether there is a way to configure a TChain so that it closes and deletes each file automatically, or whether there is some other configuration we should apply so that it doesn’t leave files open as it goes along.
[quote]I guess what I’m asking is whether there is a way to configure a TChain so that it closes and deletes each file automatically, or whether there is some other configuration we should apply so that it doesn’t leave files open as it goes along.[/quote]What you are asking for is indeed already the current behavior: the files are closed before going on to the next one. Something more complex must be going on.
Oh! Thank you very much, Philippe. I will follow up within BaBar based on your reply, and see if we can’t isolate a simple example for further troubleshooting. It sounds like we ought not to be having problems.