Close files opened in TChains explicitly

Hi,

I am building a number of histograms that I write to a final ROOT file, but in order to fill these histograms, I need to open several input files. These input files represent different physics processes, and for each process I have systematic variations. My code treats one physics process and one systematic variation at a time, so the number of files I need to process at once is quite manageable. However, the Python script I use loops over all these physics processes and their systematic variations, so in total the same script ends up processing a large number of files. I eventually run into:

SysError in <TFile::TFile>: file /global/common/higgstautau/ntuples/lephadTotal/LHProcessor.PythiaWH100_tautaulh.mc11c.root can not be opened for reading (Too many open files)
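
For context, "Too many open files" is the operating system reporting that the process has reached its limit on open file descriptors. A minimal sketch for checking that limit from inside the Python script follows, assuming a Unix system. The function name fd_usage is made up for illustration, and the count of open descriptors relies on /proc/self/fd, which exists on Linux only.

```python
import os
import resource  # standard library, Unix only


def fd_usage():
    """Return (open_count, soft_limit, hard_limit) for the current process.

    open_count is None where /proc/self/fd is not available (e.g. macOS).
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    try:
        # Each entry in /proc/self/fd is one open descriptor (Linux only).
        # The listing itself briefly uses one descriptor, so the count
        # can be one higher than the number held by the script.
        open_count = len(os.listdir("/proc/self/fd"))
    except OSError:
        open_count = None
    return open_count, soft, hard


if __name__ == "__main__":
    print("open descriptors: %s, soft limit: %s, hard limit: %s" % fd_usage())
```

Calling fd_usage() once per physics process in the main loop would show whether the count keeps growing, which is the symptom of files that are never closed.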

I use a wrapper python class to load the samples (TreeLoader), which I attached. I explicitly destroy the TChain in the destructor, and in the main script I use the ‘del’ statement to remove the TreeLoader and the TChain it outputs, after each physics process/systematic. For some reason, it is not enough to close the files.

Is destroying a TChain enough to close the files it opened? Do you have any suggestions to make sure I close these files?
TreeLoader.py (2.49 KB)

Hi,

Yes, deleting the TChain C++ object is enough to close the underlying files. However, I don’t know whether what you describe releases enough references for PyROOT to actually delete the underlying C++ object.

Cheers,
Philippe.

Also, the code:

tcTest = TChain(data.treeNameTest)
tcTest.Add(newPath)
self._tree.Add(tcTest)

is somewhat unusual. Why not simply:

self._tree = TChain(data.treeNameTest)
self._tree.Add(newPath)

Hi,

‘del’ only removes a single reference and is completely superfluous here. In the code you sent, there’s a getTree() method that can return self._tree. If a reference obtained from that method is still outstanding, it will keep the TChain alive (you can call gc.get_referrers() from the gc module on self._tree in the __del__ of TreeLoader to find any remaining referrers).
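
The point about outstanding references can be illustrated without ROOT. The sketch below uses a hypothetical stand-in class, FakeChain, and a simplified Loader (neither is the attached TreeLoader) to show that del on the loader does not free the object while another container still holds it, and that gc.get_referrers() reveals the holder.

```python
import gc
import weakref


class FakeChain:
    """Hypothetical stand-in for a TChain, used only to show reference counting."""


class Loader:
    """Simplified loader that owns a chain and hands it out via get_tree()."""

    def __init__(self):
        self._tree = FakeChain()

    def get_tree(self):
        return self._tree


def demo():
    loader = Loader()
    leaked = [loader.get_tree()]            # an outstanding reference kept elsewhere
    probe = weakref.ref(loader.get_tree())  # lets us check whether the object is alive

    del loader                              # drops the loader's reference only
    gc.collect()
    still_alive = probe() is not None

    # gc.get_referrers returns the containers that still point at the object.
    held_by_list = any(r is leaked for r in gc.get_referrers(probe()))

    leaked.clear()                          # release the last reference
    gc.collect()
    freed = probe() is None
    return still_alive, held_by_list, freed
```

On CPython, demo() is expected to report that the object survives the del, that the list is among its referrers, and that it is freed once the list lets go of it. For a real TChain, that last step is when the C++ object can be deleted and its files closed.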

I’m not sure whether there’s another way to close the files more explicitly. At least I couldn’t find it.

Cheers,
Wim

Hi,

A call to TChain::Reset would also close the file (but it additionally forgets all settings, such as the list of files).
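
One way to make sure Reset is always called is to wrap the chain in a small context manager. This is only a sketch: the name reset_on_exit is made up for illustration, and it relies on nothing beyond the object having a Reset() method, as TChain does.

```python
from contextlib import contextmanager


@contextmanager
def reset_on_exit(chain):
    """Yield chain, then call chain.Reset() even if the body raises.

    chain is meant to be a ROOT.TChain, but any object with a Reset()
    method works, which keeps this helper usable without ROOT.
    """
    try:
        yield chain
    finally:
        chain.Reset()


# Usage with PyROOT (requires ROOT; tree name and path are placeholders):
#
#     import ROOT
#     with reset_on_exit(ROOT.TChain("tree_name")) as chain:
#         chain.Add("input.root")
#         ...  # fill histograms from the chain here
```

Because Reset also drops the list of files, the chain has to be filled again with Add before it can be reused after the with block.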

Cheers,
Philippe.

Hi guys,

Thanks a lot for the help, it’s been very helpful in the end. Turns out there was something fishy going on with

tcTest = TChain(data.treeNameTest)
tcTest.Add(newPath)
self._tree.Add(tcTest)

The reason I coded it this way is because I store testing and training events in different trees in my files, and in some circumstances I want to concatenate both trees together, something the TChain doesn’t allow me to do easily. In any case, it was the tcTest and tcTrain TChains that weren’t being deleted. So I just did:

tcTest = TChain(data.treeNameTest)
tcTest.Add(newPath)
self._tree.Add(tcTest)
tcTest.Reset()

and it fixed it.

Thanks again!
Michel.