Memory Leak

Hello,

I’m trying to use PROOF to run a reweighting fit on a 2-dimensional histogram.
To do that, I have to call Process several times on the same tree (in my fcn function for Minuit). But the histogram I fill in the PROOF Process causes a memory leak of several megabytes, so the program crashes once it runs out of memory.

Attached are my TSelector and a small program that reproduces the memory leak. (Instead of Minuit, I just call Process in a loop.)

Program (compiles, or runs in ROOT):

memleak.C

[code]// getProof.C is the one from the PROOF tutorial; it was the easiest
// way to get PROOF to work for me.
#include "getProof.C"
#include <TChain.h>
#include <TH2D.h>
#include <TCanvas.h>
#include <TProof.h>
#include <TString.h>
#include <TSystem.h>

TProof *proof;
TChain *chain;

void memleak();

int main() {
   memleak();
   return 0;
}

void memleak() {
   // Start up PROOF
   TString tutdir = "/tmp/sailer";
   if (gSystem->AccessPathName(tutdir)) {
      Printf("runProof: creating the temporary directory"
             " for the tutorial (%s) ... ", tutdir.Data());
      if (gSystem->mkdir(tutdir, kTRUE) != 0) {
         Printf("runProof: could not assert / create the temporary directory"
                " for the tutorial (%s)", tutdir.Data());
         return;
      }
   }
   proof = getProof("proof://localhost:11093", -1, tutdir.Data(), "restart");
   chain = new TChain("T");
   chain->Add("/tmp/sailer/fit2corrected.root");
   chain->SetProof();
   // Stand-in for the Minuit fcn: run the same selector repeatedly
   for (Int_t k = 0; k < 5; k++) {
      chain->Process("leaksel.C+");
   }
   gSystem->Exit(0);
   return;
}[/code]

leaksel.C

[code]
#define leaksel_cxx
#include "leaksel.h"
#include <TH2.h>

void leaksel::Begin(TTree * /*tree*/)
{
}

void leaksel::SlaveBegin(TTree * /*tree*/)
{
   // Book the 2D histogram on each worker and register it for merging
   temp2d = new TH2D("h2", "prediction", 1000, 0.6, 1.003, 1000, 0.6, 1.001);
   fOutput->Add(temp2d);
}

Bool_t leaksel::Process(Long64_t entry)
{
   fChain->GetEntry(entry);
   temp2d->Fill(scaltree, srectree);
   return kTRUE;
}

void leaksel::SlaveTerminate()
{
   // This prevents the memory leak, but then I can't analyse in Terminate
   // delete temp2d;
}

void leaksel::Terminate()
{
   // Analyse temp2d here: calculate chi2 against another histogram
   // that comes in as input
   // temp2d = (TH2D*)fOutput->FindObject("h2");
}[/code]
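
For the chi2 step mentioned in the Terminate() comment, a minimal sketch in plain C++ (no ROOT) might look like the following. The flat bin vectors, the function name `chi2_binned`, and the choice of the data bin content as the variance (a Poisson assumption) are illustrative assumptions and are not taken from the attached selector.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: chi2 between a measured and a predicted histogram,
// both given as flat vectors of bin contents with identical binning.
// Assumes Poisson errors on the data (variance = data bin content) and
// skips empty data bins, which have no error estimate in this scheme.
double chi2_binned(const std::vector<double>& data,
                   const std::vector<double>& prediction)
{
    if (data.size() != prediction.size())
        throw std::invalid_argument("histograms must have the same number of bins");

    double chi2 = 0.0;
    for (std::size_t i = 0; i < data.size(); ++i) {
        if (data[i] <= 0.0) continue;  // no error estimate for empty bins
        const double diff = data[i] - prediction[i];
        chi2 += diff * diff / data[i];
    }
    return chi2;
}
```

In a real selector the two vectors would be filled from the bin contents of the merged "h2" and of the input histogram; a different error model would change only the denominator.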

Thanks in Advance,
André
leaksel.h (3.13 KB)

Hi André,

Have you tried doing the same processing in ROOT without PROOF enabled (sequentially)? No memory leak?

Jan

Dear André,

Thanks for reporting this problem, which is due to wrong scoping in the destructor in charge of destroying the partial output list on the worker.

The problem is fixed in the SVN trunk.

If you cannot work with the trunk the following workaround should work:

void leaksel::SlaveBegin(TTree * /*tree*/)
{
  temp2d = (TH2D *) gDirectory->FindObject("h2");
  if (temp2d) delete temp2d;

  temp2d = new TH2D("h2","prediction", 1000, 0.6, 1.003, 1000, 0.6, 1.001);
  fOutput->Add(temp2d);

  if (!gDirectory->FindObject("h2"))
      gDirectory->Add(temp2d);
}
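
The idea behind this workaround, deleting a stale object of the same name before booking a new one, can be illustrated without ROOT. The sketch below is a toy model only: `Hist`, `Registry`, `slave_begin`, and `slave_begin_leaky` are hypothetical stand-ins for TH2D, gDirectory, and SlaveBegin, and the live-object counter exists purely to make the effect visible.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a histogram; counts live instances so leaks are visible.
struct Hist {
    static int live;
    std::vector<double> bins;
    explicit Hist(std::size_t n) : bins(n, 0.0) { ++live; }
    ~Hist() { --live; }
};
int Hist::live = 0;

// Toy stand-in for a directory that tracks objects by name without owning them.
using Registry = std::map<std::string, Hist*>;

// Mimics the workaround: delete any leftover "h2" from the previous
// query, then book a fresh one and register it.
Hist* slave_begin(Registry& dir)
{
    auto it = dir.find("h2");
    if (it != dir.end()) {
        delete it->second;
        dir.erase(it);
    }
    Hist* h = new Hist(1000);
    dir["h2"] = h;
    return h;
}

// Without the cleanup step, every call leaves one more histogram behind.
Hist* slave_begin_leaky(Registry& dir)
{
    Hist* h = new Hist(1000);
    dir["h2"] = h;  // overwrites the pointer; the old object is lost
    return h;
}
```

Calling `slave_begin` five times leaves `Hist::live` at 1, while five calls to `slave_begin_leaky` leave it at 5, which corresponds to the growth seen on the workers when Process is called repeatedly.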

Please try and let me know.

G. Ganis

[quote=“jani”]Hi André,
Have you tried doing the same processing in ROOT without PROOF enabled (sequentially)? No memory leak?
Jan[/quote]
Hi Jan and Ganis,

If I run the program without PROOF, I just book 2 histograms and reset the histogram before I fill it again, so there is no memory leak. But I might not have noticed a leak, because I used a one-dimensional histogram with fewer bins before.
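
The sequential pattern described here (book once, reset before each refill) can be sketched in plain C++ as follows. `Hist2D` is a hypothetical minimal stand-in for TH2D, written only to show why reusing one object keeps memory flat; unlike TH2D it has no under/overflow bins.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Minimal fixed-binning 2D histogram (hypothetical stand-in for TH2D).
class Hist2D {
public:
    Hist2D(std::size_t nx, double xlo, double xhi,
           std::size_t ny, double ylo, double yhi)
        : nx_(nx), ny_(ny), xlo_(xlo), xhi_(xhi), ylo_(ylo), yhi_(yhi),
          bins_(nx * ny, 0.0) {}

    // Add one entry; values outside the axis ranges are ignored.
    void Fill(double x, double y) {
        if (x < xlo_ || x >= xhi_ || y < ylo_ || y >= yhi_) return;
        auto ix = static_cast<std::size_t>((x - xlo_) / (xhi_ - xlo_) * nx_);
        auto iy = static_cast<std::size_t>((y - ylo_) / (yhi_ - ylo_) * ny_);
        ix = std::min(ix, nx_ - 1);  // guard against rounding at the upper edge
        iy = std::min(iy, ny_ - 1);
        bins_[iy * nx_ + ix] += 1.0;
    }

    // Zero all bins but keep the allocated storage for the next pass.
    void Reset() { std::fill(bins_.begin(), bins_.end(), 0.0); }

    // Total number of accepted entries.
    double Sum() const {
        double s = 0.0;
        for (double b : bins_) s += b;
        return s;
    }

private:
    std::size_t nx_, ny_;
    double xlo_, xhi_, ylo_, yhi_;
    std::vector<double> bins_;
};
```

A fit loop would construct one `Hist2D` outside the loop and call `Reset()` at the start of each iteration, so no allocation happens per iteration.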

I also use fewer bins now, so I no longer run out of memory, but it is still leaking, just kilobytes instead of megabytes.

I can’t use the trunk, and the workaround does not seem to work completely.
But instead of all proofserv tasks, it now seems that only one of them is leaking (column RES).

  PID USER    PR NI %CPU   TIME+ %MEM  VIRT  RES  SHR S COMMAND
20847 sailer  17  0   67 0:13.59  2.2  225m 180m  10m R proofserv.exe
20857 sailer  25  0   64 0:07.73  0.5 72256  36m  11m R proofserv.exe
20832 sailer  15  0    7 0:02.11  1.5  165m 120m  15m S root.exe
20863 sailer  16  0    6 0:01.82  0.4 62596  33m  11m S proofserv.exe
16825 sailer  15  0    5 0:40.35 26.5 2163m 2.1g 1768 S xrootd
20859 sailer  16  0    5 0:05.33  0.4 62624  33m  11m S proofserv.exe
20861 sailer  15  0    5 0:04.13  0.4 62008  33m  11m S proofserv.exe
Before, all of them used the same amount of memory.
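
To follow the RES value from inside a program rather than by watching top, a small helper along these lines could be called after each iteration. This is a sketch under assumptions: it is Linux-only, the name `resident_kb` is hypothetical, and it reports only the calling process, so it would see the client or one worker but not the other proofserv processes.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Returns the resident set size of the calling process in kB, read from
// the VmRSS line of /proc/self/status (Linux only).
// Returns -1 if the value is unavailable.
long resident_kb()
{
    std::ifstream status("/proc/self/status");
    std::string line;
    while (std::getline(status, line)) {
        if (line.compare(0, 6, "VmRSS:") == 0) {
            std::istringstream fields(line.substr(6));
            long kb = -1;
            if (fields >> kb) return kb;
            return -1;
        }
    }
    return -1;
}
```

Printing the value once per Process call makes a steady increase easy to spot without a separate terminal.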

Thanks again,
André

Dear André,

The “leak” that you see is the second part of the fix I did in the trunk for which I did not have a workaround.
These are the histograms resulting from the merge which are kept in TQueryResults on the master.
In the trunk I have introduced a better handling of those, but in previous versions they just stay there.

Thinking about it again, I have found a workaround that works for versions >= 5.15.04.
Assuming that this is your case, attached is a small macro to be run after each query in this way: the first time

root [] gProof->Process(...);
root [] gProof->SetParallel(0);
root [] gProof->Load("RemoveObj.C+")  // only one time
root [] gProof->Exec("RemoveObj()")

and then

root [] gProof->Process(...);
root [] gProof->SetParallel(0);
root [] gProof->Exec("RemoveObj()")

This should remove the large histogram “h2”; edit RemoveObj.C if you need to remove a different histogram or additional ones.

Please try and let me know,

G. Ganis
RemoveObj.C (191 Bytes)

Hi Ganis,

I am using version 5.18/00.
This new workaround replaces the other one, right?
If I add the lines to the sample program it crashes when the process is called again.
If I remove the

I get the following error:

Error: Function RemoveObj() is not defined in current scope (tmpfile):1:
*** Interpreter error recovered ***
But the program runs fine (except for the memory).

I will see if I can get the trunk version installed.
Thanks again,
André

Hi

No, this is in addition; there were two problems: one on the workers (first workaround) and one on the master (second workaround).

I am surprised about the crashes.

Attached is the code that I run in my test with 5.18.00-patches (should be equivalent to 5.18.00). You should run it as

root [] .x testMultiRun.C(<N>)

where <N> is the number of times to call Process.

Try and let me know.

G. Ganis
leaksel.tar.gz (1.95 KB)

I just want to say that I am very pleased that this memory leak has been found. I had been looking for it for a long time (since 5.16), but did not have good documentation for it. Thanks to the advice on using the trunk, I took only the changes to TProofServ.cxx, with the latest SafeDelete(pq) in the cleanup for the request, rather than a complete trunk, into our ROOT distribution, and that resolved the long-standing issue with memory leaks when running multiple times in the same PROOF session over either different TTrees or different parameters.

Hello,

testMultiRun() works without any errors.

Before, I had called SetParallel(0) after loading RemoveObj.C, so it was loaded on the slaves, too.

The memory leak seems to be gone, too!
Thank you very much!

André

Hello Again,

I am using the trunk now.

I think there is an additional memory leak concerning objects passed as input to PROOF.

If you just add

TH2D *histo = new TH2D("histo", "histo", 1000, 0.0, 1.0, 1000, 0.0, 1.0);
p->AddInput(histo);

before the loop in ‘testMultiRun.C’ and then run the program as before, this causes a memory leak.

Do I have to delete the input in a similar way and add it every loop, or is there a different solution?

Thanks,
André

Yes, the input list was not correctly cleaned up on the workers.
Thanks for bringing this up.

It should be fixed on trunk.

A possible workaround should be to delete the histogram in SlaveBegin as done for the output one.

Let me know if you try.

G. Ganis

Hello,

I installed trunk 23759, and now PROOF doesn’t work anymore.
The same programs work in 5.18.

This happens even for an empty selector.

Could this be a problem with my installation?

I’ve emptied testMultiRun.C and leaksel.C and ran the program.

See the attached file, where I have also included the output.

Thanks,
André
proof.txt (7.46 KB)