Possible memory leak writing TTrees?

makeKLCoeffsOriginal.C (15.3 KB)

I run the attached macro on input files containing TTrees of variable length. The longer the trees become, the more memory the macro uses. If they get too large, it appears that oom_killer issues a SIGKILL, and I then try to resume the macro's job. The way I attempt to resume the job can be seen on lines 248-252 and 260-296. If the memory leak is related to the growing size of the TTrees as they are being written, is there a line, in addition to line 263,

sampleKLCoeffsTree -> SetAutoSave(numEntries / 100);  //  AutoSave the tree at every 1% completion.

that I should be adding? I'm not entirely sure of the usage, but would adding the line

sampleKLCoeffsTree -> SetAutoFlush(numEntries / 100);

after line 263 be applicable here in resolving the issue?
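
In other words, something like this sketch, with numEntries and sampleKLCoeffsTree as they appear in the attached macro (I'm guessing at the combination, hence the question):

sampleKLCoeffsTree -> SetAutoFlush(numEntries / 100);  //  Flush baskets to disk at every 1% completion.
sampleKLCoeffsTree -> SetAutoSave(numEntries / 100);   //  AutoSave the tree header at the same cadence.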

Attempting to pinpoint the source of the leak with gdb hasn't been successful yet, but if anyone could point out an obvious cause in the macro I would appreciate it!


ROOT Version: 6.18/00
Platform: CentOS Linux 7
Compiler: gcc 4.8


Another feature that could be applicable here, if it's the size of the TTree as it's being filled that's the problem, might be TTree::SaveSelf()? But I haven't used that feature before, so I don't know.

It does not seem to be a memory leak; rather, you are accumulating the data to write in memory before writing it to disk. How big is your file? Setting auto-flush should help with that, as you pointed out.

Thanks for the reply, amadio. I create files based on classifications, referencing files that contain 377625 entries. One of these classification-sorted files contains 8041 entries at a size of 6.6 MB and another contains 10683 entries at 8.8 MB, which works out to roughly 821 bytes per entry, so at most one of these files would be about 377625 × 821 bytes ≈ 310 MB?

So if I do use SetAutoFlush(), can I use it as described in my first post? Will it be a problem having both SetAutoFlush() and SetAutoSave() set to the same numEntries / 100?

One other thing that I would like clarified: in the documentation for the TTree class, under TTree::SetAutoSave(), it says that

When filling the Tree the branch buffers as well as the Tree header will be flushed to disk when the watermark is reached.

Does that mean

sampleKLCoeffsTree -> SetAutoSave(numEntries / 100);

should implicitly set

sampleKLCoeffsTree -> SetAutoFlush(numEntries / 100);

Or does TTree::SetAutoFlush() do something additional here?

This is fine.

Or does TTree::SetAutoFlush() do something additional here?

Logically you would want the value of AutoFlush to be strictly less than AutoSave, and, better yet, AutoSave should be a multiple of AutoFlush.

AutoSave controls how often the meta data is stored (the changing part being the location of the baskets).

AutoFlush controls how often the data is “flushed” to disk. This flush is the ‘finalization’ of a cluster, meaning that after an AutoFlush all the data for an integral number of entries is on disk. (Because some branches have a much bigger data load than others, it can happen that a branch writes some of its current entries/data to disk before the next AutoFlush.)
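
As a minimal sketch with made-up numbers (mytree stands for whatever tree is being filled):

mytree->SetAutoFlush(5000);      // finalize a cluster every 5000 entries
mytree->SetAutoSave(10 * 5000);  // store the meta data every 10 clusters, a multiple of AutoFlush

A negative argument switches the unit from entries to bytes; e.g. SetAutoFlush(-30000000) flushes roughly every 30 MB of data written.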

If the memory leak is related to the growing size of the TTrees

Are you sure this is the case? By default a TTree keeps in memory at most the amount of memory needed to store the uncompressed version of 32 MB of data on file (so if your average compression factor is 4, this would be around 128 MB). There are also a few incidental allocations in the meta-data that shouldn't be noticeable until you reach the order of many millions of entries.
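
To see where you stand, you can compare the uncompressed and compressed sizes of what the tree has written so far, for example (mytree again a placeholder):

Long64_t tot = mytree->GetTotBytes();  // total uncompressed bytes of the tree
Long64_t zip = mytree->GetZipBytes();  // total compressed bytes on file
std::cout << "compression factor ~ " << double(tot) / double(zip) << std::endl;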

Attempting to pinpoint the source of the leak with gdb hasn't been successful yet

This is not the best tool to detect leaks. You need to use a memory profiler (for example, the ‘massif’ tool provided by valgrind).

Cheers,
Philippe.

Thank you for your response, Philippe. I found your explanation of AutoSave in comparison to AutoFlush very informative.

Your question about whether the memory leak really relates to growing TTree size, and amadio's question about the resulting file size, have also been helpful. Given the maximum size at which I expect my files to be produced, it does seem like the leak would be elsewhere in the macro.

I have started looking into using valgrind. I have modified what Axel has under “Threading checks” in his post, also including the flag --leak-check=full as described here. Something that I would like clarified, though, is whether this flag includes the output corresponding to setting the flag --tool=massif. It doesn't look like I can set both of these flags simultaneously?

Indeed, --leak-check is an option of the tool “Memcheck”. This lets you see the data that is “really” leaked (i.e. never deleted). The massif tool would instead show you where the memory is used (at a series of points in time); it allows you to detect cases where something grows during the process lifetime but is deleted right before the end (memory hoarding).
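
In other words, two separate invocations, for example (assuming batch mode and ROOT's suppression file under $ROOTSYS/etc; massif writes its snapshots to massif.out.<pid>, which ms_print displays):

valgrind --tool=memcheck --leak-check=full --suppressions=$ROOTSYS/etc/valgrind-root.supp root.exe -b -q makeKLCoeffsOriginal.C
valgrind --tool=massif root.exe -b -q makeKLCoeffsOriginal.C
ms_print massif.out.<pid>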


It has been slow going trying to find relevant leaks with valgrind so far, but I just thought of something while valgrind processes. Could lines 51-52 and 142-143 be problematic? The functions they're in call the same TTrees repeatedly inside a do-while loop between lines 350-356. Perhaps at the end of the two functions containing lines 51-52 and 142-143, I should do

KLBasisTree -> ResetBranchAddresses();

or

KLBasisTree -> Reset();

?

I haven't had to use these functions before, which is why I'm asking here whether this is how they should be used.

For any piece of code of the form:

some_class *some_pointer = 0; // 0 ... or ... new some_class()
some_tree->SetBranchAddress("some_branch", &some_pointer);

you, in the end, need to execute:

some_tree->ResetBranchAddresses(); // disconnect from local variables
delete some_pointer; // cleanup

or simply:

delete some_tree; // no longer used
delete some_pointer; // cleanup
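Applied to your do-while loop, a sketch of the full pattern might look like this (the file and branch names are placeholders, not the ones from your macro):

TFile *inFile = TFile::Open("input.root");
TTree *KLBasisTree = nullptr;
inFile->GetObject("KLBasisTree", KLBasisTree);

std::vector<double> *coeffs = nullptr; // placeholder branch object
KLBasisTree->SetBranchAddress("coeffs", &coeffs);

for (Long64_t i = 0; i < KLBasisTree->GetEntries(); i++) {
   KLBasisTree->GetEntry(i);
   // ... use coeffs ...
}

KLBasisTree->ResetBranchAddresses(); // disconnect from local variables
delete coeffs;                       // cleanup
delete inFile;                       // closing the file also deletes the tree it owns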