Hi ROOT experts,
I’m noticing a memory leak in Factory::TrainAllMethods. I do not know whether it is intended behaviour or a bug.
I’m experiencing this while training several methods with a large dataset, inside the same factory.
Practically at every training iteration (in particular when the output prints “Preparing the XXXX transformation…”) I notice a sizable bump in memory usage (in my case around 1 GB). It seems that at every iteration the entire dataset is (re)loaded into memory, without the copies from previous iterations being freed. This quickly leads to memory saturation, and to the process being killed once too many methods have been trained.
Is it necessary for TMVA to keep all the datasets loaded at the same time, even when they contain the same data?
@moneta could you have a look, please?
Could you give us some more details on the particular setup in which you experience this problem? E.g. does the problem show up only for specific methods, or for all of them? Do you use transformations on 1) the dataset, 2) the methods?
If possible, a minimal reproducer would be great: a dummy dataset and maybe 2 methods that showcase the problem.
The problem shows up for every method, but all of the methods are PyKeras ones.
I use the transformation [N,G,N] for every variable and for every method.
A minimal reproducer is hard to cook up, since the issue is noticeable only with quite large datasets.
That is very helpful. We can easily produce a synthetic dataset of, let’s say, 100 MB in memory using dataloader.AddEvent(...) and then just book 10 methods (or more).