"Too many files open" After re-opening same TFile

Dear ROOTers,

I have an application that looks at a TFile that is being updated by a different process. I open the file using
f = new TFile(filename, "READ");

look at the last entry in the TTree in the file, then do
f->Close();
delete f;

before I finish with it.
The file itself is a symlink to a ROOT file that is both being regularly updated and replaced by our DAQ monitoring system.

However, after doing this repeatedly for a week, the job crashes with “too many open files” when it fails to open the file.
Some other symptoms:

  • looking through gROOT->GetListOfFiles() shows only one copy of the open file
  • looking at lsof shows many copies of that file still open (i.e. each one is assigned a file descriptor that was never closed).

Is there some way in which ROOT may be hanging onto the file descriptor even after the TFile has been closed?

–Nathaniel

  1. Do you have any inter-process synchronization to ensure that only one process can access your file at a time?
    If not, why should it work at all? What should one ROOT (in your app) do if another ROOT (in your DAQ)
    replaces the file while the first is reading it? I’m not sure ROOT gives any guarantee in such a case.

  2. Can you reproduce your problem in a simple ROOT macro if you, say, open and close the same file in an endless
    loop?

I think I’ve solved the problem:
At certain times the file could not be opened, so I got a TFile pointer to a “zombie” file, and I wasn’t deleting that object on the failure path.
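For anyone hitting the same thing, here is a minimal sketch of the pattern that avoids the leak. It does not use ROOT at all: FakeFile is a hypothetical stand-in for TFile that only mimics IsZombie() and Close() and counts how many objects are still alive, and safePoll is a made-up name for one polling attempt. The point is just that handing ownership to a std::unique_ptr destroys the object on every path, including the early return taken when the open fails.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for TFile so the sketch compiles without ROOT.
// It mimics only what matters here: a failed open yields a "zombie"
// object, and every object holds a resource until it is destroyed.
struct FakeFile {
    static int live;  // number of objects not yet destroyed
    bool zombie;
    explicit FakeFile(bool openable) : zombie(!openable) { ++live; }
    ~FakeFile() { --live; }
    bool IsZombie() const { return zombie; }
    void Close() {}
};
int FakeFile::live = 0;

// One polling attempt; returns true if the file could be read.
// The unique_ptr owns the object, so it is deleted when f goes out
// of scope, whether or not the open succeeded.
bool safePoll(bool openable) {
    std::unique_ptr<FakeFile> f(new FakeFile(openable));
    if (f->IsZombie()) {
        return false;  // the zombie is still destroyed here
    }
    // ... read the last entry of the tree here ...
    f->Close();
    return true;
}
```

In the real application FakeFile would be TFile. If TFile::Open is used instead of new TFile, my understanding is that it returns a null pointer rather than a zombie when the open fails, so a null check would be needed before the IsZombie() call.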

As to the questions above:

  • Most of the time, the file is open and being written to by the DAQ process, with AutoSave on the TTree. Having one process read a file while another writes it works OK in ROOT, provided that a certain seqno is not deleted out from under the other.
  • The problem came during file open and close, as you point out: when the file symlink was pointing at an uninitialized file, the open would fail and leave me with a zombie TFile.

Thanks for the thoughts!

–Nathaniel