Large memory leak retrieving data from many files

Hi,

I’m observing a huge memory leak in PyROOT when retrieving many objects (RooStats::HypoTestInverterResult and RooFitResult instances) from many different files.

The following snippet is problematic:

194     for wsId in wsNameMap:
195         name = wsNameMap[wsId]
196
197         #ht = GetHypoTestResultFromFile(f, name)
198         ht = f.Get(name)
199         if ht is None:
200             continue
201
202         #del ht
203         continue

wsNameMap simply maps a unique string to the name of an object in the open file. The map is built by filtering f.GetListOfKeys() and does not cause the leak: starting the loop body with a continue statement keeps memory usage constant. That leads me to believe that I have no problem with unclosed files. (The f.Close() call is in fact right outside this loop, after some checks on the HypoTestResult and the RooFitResult are done.)

When executing this snippet on 100 input files, memory leaks. For good measure I loop over every file 50 times, and this leaks megabytes per second.
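
To put a number on the growth rather than watching top, something like the following could be wrapped around the loop. This is only a sketch using the standard library: peak_rss_mb and track_growth are hypothetical helper names, not part of PyROOT, and since ru_maxrss reports the peak resident size it can only show growth, never memory being returned.

```python
import resource
import sys

def peak_rss_mb():
    """Approximate peak resident set size of this process, in MB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    divisor = 1024.0 * 1024.0 if sys.platform == "darwin" else 1024.0
    return peak / divisor

def track_growth(step, n_iterations, report_every=10):
    """Call step(i) repeatedly and sample the peak RSS at regular intervals.

    step is any callable taking the iteration index, for example one that
    opens a file, reads all objects with f.Get(name) and closes the file.
    Returns a list of (iteration, peak_rss_in_mb) tuples.
    """
    samples = []
    for i in range(n_iterations):
        step(i)
        if i % report_every == 0:
            samples.append((i, peak_rss_mb()))
    return samples
```

A peak that keeps climbing roughly linearly with the iteration count, while the loop with only a continue statement stays flat, would point at the f.Get() calls rather than at the file handling.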

Uncommenting the del ht on line 202 does not help. The utility function GetHypoTestResultFromFile is implemented like this:

323 def GetHypoTestResultFromFile(f, name):
324     return GetObjectFromFile(f, name, "RooStats::HypoTestInverterResult")
325
326 def GetFitResultFromFile(f, name):
327     return GetObjectFromFile(f, name, "RooFitResult")
328
329 def GetObjectFromFile(f, name, type):
330     if f.IsZombie():
331         return None
332
333     result = f.Get(name)
334     if not result or result is None or result.ClassName() != type:
335         if result:
336             print result.ClassName()
337         print "Cannot open {1} {0}".format(name, type)
338         return None
339
340     ROOT.SetOwnership(result, True )
341     return result

The leaks are observed both with and without line 340. Also switching between the strict and heuristic memory management does not help. I’m at a loss why this kind of programming should cause a leak - it’s not unusual to read O(50k) RooFit results and hypothesis tests. Any help would be greatly appreciated.

In case it is relevant: my ROOT build is 6.04.02-x86_64-slc6-gcc48-opt, using Python 2.7.4.

Cheers,
Geert-Jan

Only one HypoTestInverterResult ctor sets ownership on its internal lists. Default (used by I/O) and copy ctor do not.

-Dom

Who does own them then? Closing the file does not invalidate the pointers, explicitly deleting the file pointer does not invalidate them, asking for ROOT.SetOwnership(result, True) does not give the HypoTestInverterResult ownership of its internal lists, and using SetDirectory doesn’t matter, as I’m keeping the file open while reading its output. Do you have a snippet of code that would demonstrate a non-leaking use of HypoTestInverterResult?

Is the same thing true about a RooFitResult? Just repeatedly reading that from a (set of) file(s) in similar fashion also leaks.

Edit @ 14:28:

In the meantime I have done some more experiments to see how I can avoid this problem. So far I have not been able to, and it is a huge showstopper. Frankly, it’s crazy that a loop like this:

f = ROOT.TFile("foo.root", "READ")
for name in names:
    hypotest = f.Get(name)
f.Close()

should EVER leak memory. It is not unreasonable to have O(500) upper limit results (i.e., HypoTestInverterResult) in O(10) files (think of a SUSY analysis with 10 signal regions and 500 model points). With the default 20 HypoTestResult in the HypoTestInverterResult TList, this leaks a whopping 100000 HypoTestResult objects. This should not be the default behaviour in my opinion, or if it is, there should be a very clear example on how to avoid this. It is crazy that a HypoTestInverterResult is not self-contained when constructed from a file.

The only solution is running many scripts (so that 1 process just opens 1 file), but the overhead is about 2 seconds to do “import ROOT” and load a library. With many files, that is simply not feasible: it should not take two hours to loop over several thousand very small files!
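
A middle ground between one process for everything and one process per file would be a worker pool whose processes are replaced after a fixed number of files, so leaked memory goes back to the OS when a worker exits and the start-up cost is paid once per batch. The sketch below is untested against ROOT and only illustrates the pattern with the standard multiprocessing module: process_file and process_in_batches are hypothetical names, and the body of process_file is a placeholder for the real TFile reading.

```python
import multiprocessing
import os

def process_file(path):
    """Placeholder for the per-file work (hypothetical).

    The real version would open the file with ROOT.TFile, read the results,
    extract the numbers of interest and close the file. Returning plain
    Python values keeps the transfer back to the parent simple.
    """
    return (path, os.getpid())

def process_in_batches(paths, n_workers=2, files_per_worker=50):
    """Process all paths in a pool that recycles its worker processes."""
    # Python 3 lets us request the "fork" start method explicitly;
    # Python 2.7 (as used in this thread) always forks on Linux.
    if hasattr(multiprocessing, "get_context"):
        mp = multiprocessing.get_context("fork")
    else:
        mp = multiprocessing
    # maxtasksperchild makes the pool replace a worker once it has handled
    # that many tasks; whatever it leaked is freed when the process exits.
    pool = mp.Pool(processes=n_workers, maxtasksperchild=files_per_worker)
    try:
        # chunksize=1 makes every file count as exactly one task.
        return pool.map(process_file, paths, chunksize=1)
    finally:
        pool.close()
        pool.join()
```

Calling process_in_batches(list_of_paths, n_workers=4, files_per_worker=100) would then cap the leak at roughly 100 files' worth per worker. This only hides the leak; it does not fix the underlying ownership problem.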

No-one. I am saying this is a bug in HypoTestInverterResult.

Don’t know. It has a custom copy ctor but a default assignment operator, which can’t be good. It also has a custom streamer, so the actual behaviour in your case depends on which versions are persistent/transient.

I’d argue for writing a small test in a .C macro. If it leaks, file a bug report with the RooStats devs. If it doesn’t, file a bug report with the PyROOT devs.

-Dom

In the equivalent code in C++ there is definitely also a leak. I’ll try to patch the class and submit it.