Memory leak in TTree::Fill()

Hello, all

//how to compile : g++ -o test test.cc `root-config --cflags --libs`
#include "TFile.h"
#include "TTree.h"

unsigned int pid, sid, tof, pl, pr, epoch;
int main()
{

    TFile* rootfile = new TFile("test.root", "recreate");
    TTree* roottree = new TTree("EventTree", "DC-TOF Event Data");
    roottree->Branch("pid", &pid, "pid/i");
    roottree->Branch("sid", &sid, "sid/i");
    roottree->Branch("tof", &tof, "tof/i");
    roottree->Branch("pl", &pl, "pl/i");
    roottree->Branch("pr", &pr, "pr/i");
    roottree->Branch("epoch", &epoch, "epoch/i");

    pid=sid=tof=pl=pr=epoch=100;
    while(true) {
        roottree->Fill();
    }

    roottree->Write();
    rootfile->Close();
}

With the above code, I monitored its memory usage with ‘top -p [pid of test]’.
The memory usage kept increasing slowly until the program eventually crashed.
I suspect TTree::Fill(). Is this normal? Or is there a solution?
I would like to use TTree in my DAQ program, which runs for a long time.

I have modified the loop of your program to print its resident and virtual memory while it runs.
See the results below, where I let the program run for 100 million iterations and produce a 1 GByte compressed output file.
It is normal for the program to grow very slightly with time, because the tables holding the basket offsets on disk grow with the amount of data. Increasing the basket size slows this growth, since fewer, larger baskets need fewer table entries.
But this should be OK even if your program produces a 100 GByte output file.

Output

i=0,totbytes=0, MemRes=15968KB, MemVirt=36732KB
i=10000000,totbytes=240000000, MemRes=43152KB, MemVirt=79280KB
i=20000000,totbytes=480000000, MemRes=43152KB, MemVirt=79280KB
i=30000000,totbytes=720000000, MemRes=43152KB, MemVirt=79280KB
i=40000000,totbytes=960000000, MemRes=43152KB, MemVirt=79280KB
i=50000000,totbytes=1200000000, MemRes=43152KB, MemVirt=79280KB
i=60000000,totbytes=1440000000, MemRes=43152KB, MemVirt=79280KB
i=70000000,totbytes=1680000000, MemRes=43152KB, MemVirt=79280KB
i=80000000,totbytes=1920000000, MemRes=43168KB, MemVirt=79280KB
i=90000000,totbytes=2160000000, MemRes=43168KB, MemVirt=79280KB
i=100000000,totbytes=2400000000, MemRes=43168KB, MemVirt=79280KB
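Loop

// Needs #include "TSystem.h" for gSystem and ProcInfo_t.
pid=sid=tof=pl=pr=epoch=100;
Int_t i=0;
Long64_t totbytes=0;
ProcInfo_t info;
while(true) {
    if (i%10000000 == 0) {
        // Report resident and virtual memory every 10 million entries.
        gSystem->GetProcInfo(&info);
        // fMemResident and fMemVirtual are Long_t, hence %ld.
        printf("i=%d,totbytes=%lld, MemRes=%ldKB, MemVirt=%ldKB\n",i,totbytes,info.fMemResident,info.fMemVirtual);
    }
    totbytes += roottree->Fill(); // Fill() returns the number of bytes written to the buffers
    i++;
    pid++;sid++;tof++;pl++;pr++;epoch++;
}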
Rene

Thank you for your reply.
My DAQ program with TTree storage is a server that receives events from multiple data-producing clients.
Good throughput is important. The throughput is high enough at the start of a run, but it falls drastically once the memory usage has grown to some critical level.
Anyway, please let me know how to adjust the basket size.

Hi,

To increase the buffer size, specify it on the branch creation line:
roottree->Branch("pid", &pid, "pid/i", 128000); // default is 32000

How long does it take for the slowdown to appear? What is the size of the file at that time? Which version of ROOT are you trying this with? (Can you try with v5.27/04?)

With the trunk and the default buffer size, I see a relatively small memory increase of about 228 KB per 100 million entries, i.e. about 228 KB per 2.5 GB of real data (and, in this artificial example, about 700 MB of file size).

In addition, you can control the maximum size of the file automatically by using:
roottree->SetMaxTreeSize(10LL*1024*1024*1024); // Set maximum size to 10 GB; the LL suffix keeps the product from overflowing int
In that case you also need to change ‘rootfile->Close();’ to ‘roottree->GetDirectory()->GetFile()->Close();’, because the tree may have switched to a new file by then.
See the documentation for TTree::ChangeFile for details.

Cheers,
Philippe.

I figured out the problem with my DAQ program.



As I mentioned, my program suffered a sudden throughput drop (at -100 seconds in the above figure).
At that time, not only did the memory increase rapidly, but the CPU usage also dropped, with the saved ROOT file at about 30 MByte. So I suspected the AutoFlush of TTree, and I solved the problem by disabling it with TTree::SetAutoFlush(0).

Disabling AutoFlush is very bad! You move the problem to the readers of your files.
See the documentation of TTree::SetAutoFlush and TTree::Fill. You can control the rate at which the buffers are flushed to disk. The default is 30 MBytes. You can change it to smaller values, e.g.
mytree.SetAutoFlush(nevents); // nevents > 0 is the number of entries that triggers an autoflush
mytree.SetAutoFlush(-5000000); // a negative value is a size in bytes: the buffers are flushed every 5 MBytes

You can also call mytree.SetAutoSave(nbytes) to raise the threshold at which the tree is checkpointed.

Rene

[quote]At that time, not only the memory increased rapidly but the cpu occupancy rate also fell down with the saved root file of ~30MByte.[/quote]
The results are strange. I would expect a one-time cost (well, once every 30 MBytes), but in between the rate should stay the same. Would you be able to send a complete, running (standalone) script or program showing this behavior?

Thanks,
Philippe.

My program is fairly complicated, so I have attached simplified files. I’m using ROOT 5.25 and the Boost C++ libraries 1.40 for my program.
This is the result.


To avoid the throughput drop, I used TTree::SetAutoFlush(24000).
client.cc.txt (1.9 KB)
server.cc.txt (3.09 KB)

Hi,

Thanks for the example, I was able to reproduce and fix the problem (in revision 37444 of the trunk).

The issue arises mostly because the data in your example was so homogeneous that the compression ratio was extremely high (120!!), which triggered a problem in the basket size optimization algorithm (resulting in a serious de-optimization rather than an optimization). The same example with ‘random’ data does not show any rate drop.

Cheers,
Philippe.