Naive question about garbage collection

Hi,

I’ve been having some problems with PyROOT using up lots of memory and crashing. After lots of investigation, the cause seems to be that ROOT does not delete objects from memory when they go out of scope in Python. I’ve reduced it to a toy problem:

[code]import ROOT as R

rf = R.TFile("histograms.root")

for key in rf.GetListOfKeys():
    hist = rf.Get(key.GetName())[/code]

“histograms.root” contains a large number of histograms (about 5000). When I run this code it gobbles up about 2 GB of memory, which does not get released until python terminates. My (somewhat patchy) understanding of python memory management is that when I re-assign the variable “hist” by reading the next histogram, the reference count to the old histogram is decremented by one, and since there should only have been one reference to it, it should then be deleted. My understanding is that then ROOT calls the “Delete” function on the histogram “behind the scenes”.
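
The expectation described above does hold for objects that Python itself owns. A minimal pure-Python sketch (the Payload class and the load function are hypothetical stand-ins, not ROOT classes) shows CPython destroying the previous object as soon as the name is rebound:

```python
import weakref


class Payload:
    """Hypothetical stand-in for a Python-owned object (not a ROOT class)."""

    def __init__(self, name):
        self.name = name


freed = []  # names of objects whose finalizer has run


def load(name):
    obj = Payload(name)
    # Register a callback that runs when obj is actually destroyed.
    weakref.finalize(obj, freed.append, name)
    return obj


for name in ["h1", "h2", "h3"]:
    # Rebinding 'hist' drops the last reference to the previous object.
    hist = load(name)

# In CPython, h1 and h2 are already gone here; h3 is still referenced by 'hist'.
print(freed)
```

This only describes objects whose lifetime Python controls. If the Python side does not own the underlying C++ object, dropping the Python reference would not by itself delete it, which would be consistent with the memory growth seen here.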

I tried explicitly decreasing the reference count to “hist” at the end of the loop:

[code]for key in rf.GetListOfKeys():
    hist = rf.Get(key.GetName())
    del hist[/code]

Same thing happens - 2GB of memory used.

Next I tried explicitly calling “Delete()” on the histogram:

[code]for key in rf.GetListOfKeys():
    hist = rf.Get(key.GetName())
    hist.Delete()[/code]

Now running the script only uses about 60MB of memory!

I imagine this is something to do with me misunderstanding memory management in python. I thought it should never be necessary to call “Delete()” on ROOT objects manually, but here it seems to be the only way to prevent crazy memory usage. Any suggestions?

Try using:

Works for me.

Hi,

arguably that should be ROOT.SetOwnership(your_object, True), as Python does not by default take ownership of objects returned from C++ (unless they are returned by a constructor). This may work as well, but should be used with more care: rf.Get._creates = True
Cheers,
Wim
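
As a sketch of the per-object approach, the toy loop from the first post could take ownership of each object right after reading it. The function name read_all_histograms is hypothetical, and the block assumes a working PyROOT installation; it uses only the calls that appear in this thread (TFile, GetListOfKeys, Get, SetOwnership).

```python
def read_all_histograms(path):
    """Read every key in the file, letting Python own each object it gets back."""
    import ROOT  # imported lazily; assumes PyROOT is installed

    rf = ROOT.TFile(path)
    n_read = 0
    for key in rf.GetListOfKeys():
        hist = rf.Get(key.GetName())
        # Hand ownership to Python, so the C++ object is deleted once the
        # last Python reference goes away (here: when 'hist' is rebound).
        ROOT.SetOwnership(hist, True)
        n_read += 1
    return n_read
```

Calling read_all_histograms("histograms.root") would then walk the whole file. Unlike the _creates switch, this affects only the objects it is called on.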


Hi,

Wim’s solution prevents the large memory usage, and makes more sense. I had not realised that objects returned by C++ functions other than constructors are not owned by Python; this all makes a lot more sense now!

Could I ask what the disadvantages or dangers of using

rf.Get._creates = True

would be?

Thanks for the answers!

Nick

You are right, sorry for the misleading comment.

Hi,

the danger in setting _creates is that it is set on TFile.Get itself, not just on this one file, so any other TFile object that is opened and has Get() called on it will from then on also return objects owned by Python. That may or may not be what you want.

Cheers,
Wim
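
The scope issue can be illustrated with a pure-Python analogy. This is not how PyROOT is implemented; the Method and File classes and the creates attribute are hypothetical, and the sketch only assumes that the method object is shared by all instances of the class, as described above.

```python
class Method:
    """Hypothetical stand-in for a method object shared by a whole class."""

    def __init__(self):
        self.creates = False  # plays the role of the ownership flag


class File:
    Get = Method()  # one object, stored on the class, not on each instance


a = File()
b = File()

a.Get.creates = True   # set through one instance...
print(b.Get.creates)   # ...and it is visible through every other instance
```

Because the flag lives on the shared object, code elsewhere in the same process that opens its own file sees the changed behaviour too.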