Dear All,
I’m using RDataFrame to process a large number of files (~1000), each with a large number of events (essentially an entire MC campaign). During processing, I observe a constant, rapid growth of cache memory consumption. To give you an example, consider the following output, which shows the memory state just before the program starts:
total used free shared buff/cache available
Mem: 251G 4.0G 247G 18M 162M 247G
Swap: 0B 0B 0B
Just a couple of minutes later, the picture changes dramatically (multi-threading being disabled):
total used free shared buff/cache available
Mem: 251G 4.1G 41G 18M 206G 246G
Swap: 0B 0B 0B
This continues until the cache consumes all memory, which eventually kills the connection to our institute’s server. I have no idea why this happens. I’ve read in this forum that the compression of the branches in the input ROOT files can play a role, so I took a look at the tree and saw that some branches have a compression factor of more than 50! Those files are not generated by me, so there’s nothing I can do about that. However, even excluding those columns when initializing the RDataFrame does not get rid of the problem.
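For reference, the numbers shown above come from `free`, which in turn reads `/proc/meminfo`. A quick, hedged way to watch the relevant fields while the job runs (assuming a Linux box; the `watch` interval is just an example) is:

```shell
# "Cached" is the kernel page cache, which grows as the input ROOT files
# are read. It is reclaimable memory, so "MemAvailable" is the number
# that actually matters for new allocations.
grep -E '^(MemTotal|MemAvailable|Cached):' /proc/meminfo

# To refresh the full picture every few seconds while the job runs:
# watch -n 5 free -h
```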
As you can see, I’m desperate. Any help is highly appreciated :).
Best regards,
Christof
_ROOT Version: 6.20/06
_Platform: x86_64-centos7
_Compiler: gcc8