Hello @FoxWise!
I cannot reproduce the problem with the latest ROOT:
import ROOT
import os, psutil
import gc
import numpy as np
process = psutil.Process()
def print_mem():
    gc.collect()
    print(process.memory_info().rss)  # in bytes
ROOT.gInterpreter.Declare("""
auto create_rvec(unsigned int n) {
    //return std::array<unsigned int, 3>({n, n, n});
    return ROOT::RVec<unsigned int>({n, n, n});
}
""")
df = ROOT.ROOT.RDataFrame(1000).Define("my_rvecs", "create_rvec(rdfentry_)")
# To trigger the event loop before measuring memory
full_array = df.AsNumpy(["my_rvecs"])
print_mem()
for d in full_array["my_rvecs"]:
    d = np.asarray(d)
print_mem()
Output:
449548288
449548288
There is no increase in memory when iterating over the RVecs.
Are you sure you’re measuring the memory usage correctly, and that garbage collection is not affecting the result? Which ROOT version are you using?
However, the first example with an RVec in a loop does show a memory increase for me as well:
import ROOT
import os, psutil
import gc
import numpy as np
import matplotlib.pyplot as plt
import tqdm
process = psutil.Process()
# To trigger initialization outside the loop. When first instantiating a given
# class PyROOT caches many things that we don't want to measure.
ROOT.RVec('double')()
def get_mem():
    gc.collect()
    return process.memory_info().rss  # in bytes
n_iter = 40000
times = np.empty(n_iter, dtype=float)
for i in tqdm.tqdm(range(n_iter)):
    times[i] = get_mem() * 1e-6
    ROOT.RVec['double']()
plt.figure()
plt.plot(times)
plt.xlabel("iteration")
plt.ylabel("rss [MB]")
plt.savefig("plot.png")
You can indeed see a jump in memory, but it doesn’t increase linearly with the number of created RVecs:
![plot](https://root-forum.cern.ch/uploads/default/original/3X/1/3/131a71ecf5fd93ce31c5e7cb603cc41196358646.png)
That’s indeed unexpected, and I don’t know of a workaround. Please open a ROOT GitHub issue about this if you need to get it fixed. I won’t do so myself because it’s always better if a user reports an issue: user reports get higher priority :slight_smile:
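To put a number on "a jump rather than a linear increase", the recorded samples can be summarized as in the sketch below. The function name `summarize_growth` and the returned keys are made up for illustration; it only assumes a sequence of RSS readings in one consistent unit.

```python
def summarize_growth(samples):
    """Summarize a sequence of RSS samples (hypothetical helper, any unit).

    Reports the total growth, the largest single step between consecutive
    samples, the index of the sample where that step was first observed,
    and the share of the total growth that the step accounts for.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    # Differences between consecutive samples.
    deltas = [b - a for a, b in zip(samples, samples[1:])]
    total = samples[-1] - samples[0]
    # Index of the largest step (first one wins on ties).
    jump_index = max(range(len(deltas)), key=lambda i: deltas[i])
    largest = deltas[jump_index]
    return {
        "total_growth": total,
        "largest_jump": largest,
        "jump_iteration": jump_index + 1,
        "jump_share": largest / total if total > 0 else 0.0,
    }
```

Calling `summarize_growth(times)` on the array filled in the script above would give a `jump_share` close to 1 if the growth happens in a single step, and a value close to `1 / n_iter` if the memory grows evenly with every created RVec.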
When producing RVecs in a loop in C++, there is no increase in memory at all, so it could indeed be related to PyROOT. But it must also be related to RVec, because with a std::vector, you don’t see this non-linear increase in memory consumption with PyROOT.
void repro() {
    ProcInfo_t pinfo;
    gSystem->GetProcInfo(&pinfo);
    double initialMem = pinfo.fMemResident;
    for (std::size_t i = 0; i < 10000; ++i) {
        ROOT::RVec<double>{};
    }
    gSystem->GetProcInfo(&pinfo);
    double finalMem = pinfo.fMemResident;
    std::cout << ( finalMem - initialMem ) << std::endl;
}
Output:
0
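To repeat the RVec versus std::vector comparison in PyROOT with identical sampling for both, a small helper along these lines might be useful. It is only a sketch: `rss_samples`, `make_object` and `read_rss` are names invented here, not part of ROOT or psutil, and the commented usage assumes both packages are importable.

```python
import gc


def rss_samples(make_object, n_iter, read_rss, warmup=1):
    """Sample memory before each of n_iter object creations, plus once at the end.

    make_object: zero-argument callable creating (and dropping) one object.
    read_rss:    zero-argument callable returning the current RSS.
    warmup:      creations done before sampling, so that one-time class
                 initialization is not part of the measurement.
    """
    for _ in range(warmup):
        make_object()
    samples = []
    for _ in range(n_iter):
        gc.collect()  # reduce noise from pending Python garbage
        samples.append(read_rss())
        make_object()
    gc.collect()
    samples.append(read_rss())
    return samples


# Hypothetical usage, assuming ROOT and psutil are available:
#   import ROOT, psutil
#   proc = psutil.Process()
#   read = lambda: proc.memory_info().rss  # bytes
#   rvec = rss_samples(lambda: ROOT.RVec['double'](), 40000, read)
#   svec = rss_samples(lambda: ROOT.std.vector['double'](), 40000, read)
```

Passing the RSS reader in as a callable keeps the helper independent of psutil, so the same loop can be checked with a fake reader or used with another memory probe.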
Cheers,
Jonas