When using RDataFrame with JITted nodes, I see memory usage grow by roughly 0.7 MB per histogram. This was also discussed in "How to delete RDataFrame and clean up memory"; from that thread and the answer in "RDataFrame Foreach causing memory leak" I gather that it is due to cling keeping the AST in memory, and that it is being worked on. Profiling confirms this: most of the space in use is allocated by llvm and clang symbols. Is there a time estimate, or a JIRA ticket I could follow? (I’m sorry for opening a new thread for this, but the other ones are closed.)
In analysis use cases, filling thousands of histograms in one go is not uncommon, and a smaller memory footprint makes quite a difference for the turnaround time on a batch system. It is possible to organise analysis code so that not too many histograms are made in one loop, but the lower the practical limit, the more work that becomes, so it does matter whether that limit is at 2000, 5000, or 10000 histograms.
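For a rough sense of scale, here is a back-of-the-envelope sketch in Python. It assumes the growth stays linear at about 0.7 MB per histogram; that linearity is an assumption, not something measured, and the function name is only illustrative.

```python
# Back-of-the-envelope estimate of the extra memory used by JITted histograms.
# Assumption: memory grows linearly with the number of histograms (not measured).
MB_PER_HISTOGRAM = 0.7  # rough per-histogram growth quoted in this post


def estimated_overhead_gb(n_histograms, mb_per_histogram=MB_PER_HISTOGRAM):
    """Return the estimated extra memory in GB (taking 1 GB = 1000 MB)."""
    return n_histograms * mb_per_histogram / 1000.0


for n in (2000, 5000, 10000):
    print(f"{n:>6} histograms: ~{estimated_overhead_gb(n):.1f} GB")
```

Under that assumption, the three limits correspond to roughly 1.4, 3.5, and 7 GB of extra memory.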
Please let me know if there are any more performance numbers I can provide, or checks that I can do, to help.
Thanks in advance,
ROOT Version: 6.18/04 (through LCG_96bpython3, x86_64-centos7-gcc9-opt)
Platform: CentOS 7
Compiler: GCC 9.2.0