Investigating memory leak while using TMinuit ROOT with valgrind

I am using TMinuit in a loop to scan some upper limit maps, and I am running into a memory problem. The only thing created within the loop is the TMinuit object, via “TMinuit * minuit = new TMinuit(n_params);”, and it is deleted at the end of each iteration with “delete minuit”. I ran valgrind and it reports something concerning Minuit (just a snippet here), but honestly, I don’t understand the output. My guess was that “delete minuit” frees all the memory. Obviously, that’s not all… Any suggestions? :slight_smile:

==17564== 46,053,008 (4,227,048 direct, 41,825,960 indirect) bytes in 25,161 blocks are definitely lost in loss record 11,738 of 11,738
==17564==    at 0x4C2E0EF: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==17564==    by 0x52D77A8: TStorage::ObjectAlloc(unsigned long) (TStorage.cxx:330)
==17564==    by 0x403601B: ???
==17564==    by 0x4036064: ???
==17564==    by 0x914984F: TClingCallFunc::exec(void*, void*) (TClingCallFunc.cxx:1776)
==17564==    by 0x914A28F: operator() (functional:2267)
==17564==    by 0x914A28F: TClingCallFunc::exec_with_valref_return(void*, cling::Value*) (TClingCallFunc.cxx:1998)
==17564==    by 0x914AC58: TClingCallFunc::ExecInt(void*) (TClingCallFunc.cxx:2095)
==17564==    by 0x53468A8: TMethodCall::Execute(void*, long&) (TMethodCall.cxx:457)
==17564==    by 0x17DDFE20: Execute (TMethodCall.h:136)
==17564==    by 0x17DDFE20: ExecPluginImpl<int, double*, double*> (TPluginManager.h:162)
==17564==    by 0x17DDFE20: ExecPlugin<int, double*, double*> (TPluginManager.h:174)
==17564==    by 0x17DDFE20: TMinuit::mnplot(double*, double*, char*, int, int, int) (TMinuit.cxx:6085)
==17564==    by 0x17DE3C18: TMinuit::mnscan() (TMinuit.cxx:6803)
==17564==    by 0x17DF744D: TMinuit::mnexcm(char const*, double*, int, int&) (TMinuit.cxx:2977)
==17564==    by 0x17DD9235: TMinuit::mncomd(char const*, int&) (TMinuit.cxx:1382)
==17564==    by 0x178CA910: ULcoh(int, int) (in /mnt/scr1/user/j_blom02/analysis/phikk/ul/ulmaps_C.so)
==17564==    by 0x178CADA4: ulmaps(bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, int, int) (in /mnt/scr1/user/j_blom02/analysis/phikk/ul/ulmaps_C.so)
==17564==    by 0x4032084: ???
==17564==    by 0x918588B: cling::Interpreter::RunFunction(clang::FunctionDecl const*, cling::Value*) [clone .part.290] [clone .constprop.445] (in /mnt/scr1/user/bes3/root/build_v6_14_08/lib/libCling.so)
==17564==    by 0x918A362: cling::Interpreter::EvaluateInternal(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::CompilationOptions, cling::Value*, cling::Transaction**, unsigned long) (in /mnt/scr1/user/bes3/root/build_v6_14_08/lib/libCling.so)
==17564==    by 0x918A60B: cling::Interpreter::process(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::Value*, cling::Transaction**, bool) (in /mnt/scr1/user/bes3/root/build_v6_14_08/lib/libCling.so)
==17564==    by 0x9217886: cling::MetaProcessor::process(llvm::StringRef, cling::Interpreter::CompilationResult&, cling::Value*, bool) (in /mnt/scr1/user/bes3/root/build_v6_14_08/lib/libCling.so)
==17564==    by 0x90FB3D9: HandleInterpreterException(cling::MetaProcessor*, char const*, cling::Interpreter::CompilationResult&, cling::Value*) (TCling.cxx:2060)
==17564==    by 0x911033D: TCling::ProcessLine(char const*, TInterpreter::EErrorCode*) (TCling.cxx:2177)
==17564==    by 0x91022A2: TCling::ProcessLineSynch(char const*, TInterpreter::EErrorCode*) (TCling.cxx:3053)
==17564==    by 0x5272649: TApplication::ExecuteFile(char const*, int*, bool) (TApplication.cxx:1157)
==17564==    by 0x52735F5: TApplication::ProcessLine(char const*, bool, int*) (TApplication.cxx:1002)
==17564==    by 0x4E4A183: TRint::ProcessLineNr(char const*, char const*, int*) (TRint.cxx:756)
==17564==    by 0x4E4B956: TRint::Run(bool) (TRint.cxx:416)
==17564==    by 0x400999: main (rmain.cxx:30)

Hi @JohannesB,
and welcome to the ROOT forum!

From the valgrind stacktrace it looks like TMinuit::mnplot (which is called indirectly by ULcoh in your ROOT macro) allocates some storage within a call that is just-in-time compiled via the cling interpreter. In other words: it does not look like it’s your code’s fault.

@moneta might be able to suggest a workaround. Alternatively, can you please provide a small, self-contained snippet of code that reproduces the memory problem?

Cheers,
Enrico

Hi @eguiraud !
The code snippet is below. After the loop, h_scan is integrated and then deleted. This loop is nested inside another one in which masses and widths are scanned. The overall output is a map in which I can plot the integral of h_scan, with masses on the y-axis and widths on the x-axis :slight_smile:

MinuitFCN_ContinuumBWcoh is just my void function containing the fit function etc.

Many thanks!

  for (int k = 0; k < scan_n; k++)
  {
    A_i = h_scan->GetXaxis()->GetBinCenter(k+1); //this is an amplitude, which will be fixed for the fit (because it is the scan variable)

    TMinuit * minuit = new TMinuit(n_params);
    minuit->SetFCN(MinuitFCN_ContinuumBWcoh);

    minuit->SetMaxIterations(5000);

    par[0] = lastC;
    par[1] = lastLambda;
    par[2] = mass;
    par[3] = width;
    par[4] = A_i;
    par[5] = lastPhi;

    err[0] = 0.01;
    err[1] = 0.01;
    err[2] = 0.01;
    err[3] = 0.01;
    err[4] = 0.01;
    err[5] = 0.01;

    min[0] = 1.;
    min[1] = 1.;
    min[2] = 0.;
    min[3] = 0.;
    min[4] = 0.;
    min[5] = 0. * TMath::Pi();

    max[0] = 10.;
    max[1] = 10.;
    max[2] = 5.;
    max[3] = 0.5;
    max[4] = 100.;
    max[5] = 2. * TMath::Pi();

    // Define start parameters
    minuit->mnparm(0, "C",      par[0], err[0], min[0], max[0], ierflg);
    minuit->mnparm(1, "lambda", par[1], err[1], min[1], max[1], ierflg);
    minuit->mnparm(2, "mass",   par[2], err[2], min[2], max[2], ierflg);
    minuit->mnparm(3, "width",  par[3], err[3], min[3], max[3], ierflg);
    minuit->mnparm(4, "A",      par[4], err[4], min[4], max[4], ierflg);
    minuit->mnparm(5, "phase",  par[5], err[5], min[5], max[5], ierflg);

    minuit->mnexcm("SET NOW", arglist, 1, ierflg);
    arglist[0] = 1;
    minuit->mnexcm("SET ERR", arglist, 1, ierflg);
    arglist[0] = 2;
    minuit->mnexcm("SET STR", arglist, 1, ierflg);
    minuit->mnexcm("CALL FCN", arglist, 1, ierflg);

    // Fix some parameters
    minuit->FixParameter(2);
    minuit->FixParameter(3);
    minuit->FixParameter(4);

    // Scan all free parameters to set a new phi window - this for loop is for test reasons set from 0 to 1
    for(int l = 0; l<1; l++){
      minuit->mncomd("scan 0 1000", ierflg);
      minuit->GetParameter(5, par[5], err[5]);
      min[5] = par[5] - TMath::Pi();
      max[5] = par[5] + TMath::Pi();
      minuit->mnparm(5, "phase", par[5], err[5], min[5], max[5], ierflg);
      /*minuit->GetParameter(0, par[0], err[0]);
      min[0] = par[0]*0.99;
      max[0] = par[0]*1.01;
      minuit->mnparm(0, "C", par[0], err[0], min[0], max[0], ierflg);
      minuit->GetParameter(1, par[1], err[1]);
      min[1] = par[1]*0.99;
      max[1] = par[1]*1.01;
      minuit->mnparm(1, "lambda", par[1], err[1], min[1], max[1], ierflg);*/
    }
    
    // Perform maximum loglikelihood fit
    minuit->Migrad();

    minuit->mnstat(minfcn, edm, errdef, npari, nparx, istat);
    minuit->GetParameter(0, lastC, best_c_err);
    minuit->GetParameter(1, lastLambda, best_lambda_err);
    minuit->GetParameter(5, lastPhi, err[5]);

    delete minuit;
    minuit = 0;

    L_i = TMath::Exp(-0.5 * (minfcn - best_minfcn));
    
    h_scan->SetBinContent(k+1, L_i);
    
  }
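
The L_i expression in the loop above can be illustrated outside ROOT. The following is only a sketch: it assumes the FCN returns -2 ln L (as the factor 0.5 suggests) and that the bins of h_scan have equal width. The thread does not show how the integral of h_scan is turned into a limit, so the cumulative-fraction rule and the helper names `relative_likelihood` and `upper_limit` are hypothetical, not the author's code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Relative likelihood of each scan point, exp(-0.5 * (fcn - fcn_min)),
// mirroring L_i = TMath::Exp(-0.5 * (minfcn - best_minfcn)) in the loop.
// Assumption: fcn holds -2 ln L values, one per scan point.
std::vector<double> relative_likelihood(const std::vector<double>& fcn,
                                        double fcn_min) {
    std::vector<double> L;
    L.reserve(fcn.size());
    for (double f : fcn) {
        L.push_back(std::exp(-0.5 * (f - fcn_min)));
    }
    return L;
}

// Hypothetical limit rule: return the first bin center at which the
// running sum of L reaches the fraction cl of the total sum.
// Assumption: equal bin widths, so sums stand in for integrals.
// Returns NaN if the inputs are empty, mismatched, or sum to zero.
double upper_limit(const std::vector<double>& centers,
                   const std::vector<double>& L, double cl) {
    if (centers.empty() || centers.size() != L.size()) return std::nan("");
    double total = 0.0;
    for (double v : L) total += v;
    if (total <= 0.0) return std::nan("");

    double running = 0.0;
    for (std::size_t i = 0; i < L.size(); ++i) {
        running += L[i];
        if (running >= cl * total) return centers[i];
    }
    return centers.back();
}
```

With ROOT, the bin centers would come from h_scan->GetXaxis()->GetBinCenter(k+1) and the FCN values from mnstat, as in the loop above.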

Hi Johannes,
thank you, I believe that’s a reproducer but unfortunately it’s not self-contained (nor minimal :smile: ).
Could you please trim it down to the essential lines needed to reproduce the problem and make it runnable for us? For example, the current reproducer is missing the definitions of A_i, h_scan, n_params, etc.

Cheers,
Enrico

Hi @eguiraud,

sure, sorry :smiley:

My code is attached. I run it with the following command:
valgrind --num-callers=30 --suppressions=my.supp --leak-check=full --track-origins=yes root.exe -l -b -q ulmaps_test.C+(0,0)

This results in the valgrind output I posted above. Again, many thanks for your time!!

Cheers,
Johannes

ulmaps_test.C (9.0 KB) ulmaps_test.h (7.3 KB)

Hi,

It may be an issue with the graph produced when mnscan calls mnplot. Try switching off the graphics mode by adding, right after the constructor:

minuit->fGraphicsMode = false; 
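
As a rough sketch of where that line would sit in a scan loop like the one posted earlier, the helper below creates one fitter per scan point, switches graphics mode off immediately, and lets a std::unique_ptr delete the fitter at the end of each iteration. `FakeMinuit`, `make_batch_fitter` and `run_scan` are hypothetical names; `FakeMinuit` is only a stand-in so that the example compiles without ROOT, and the helpers are templates so that TMinuit could be substituted.

```cpp
#include <memory>

// Stand-in for TMinuit (hypothetical) so the sketch compiles without ROOT.
// It only mirrors the public fGraphicsMode member used in the line above.
struct FakeMinuit {
    static int live;            // number of instances currently alive
    bool fGraphicsMode = true;  // assumed default: graphics mode on
    explicit FakeMinuit(int /*n_params*/) { ++live; }
    ~FakeMinuit() { --live; }
};
int FakeMinuit::live = 0;

// Create a fitter for one scan point and switch graphics mode off
// right after construction.
template <class Fitter>
std::unique_ptr<Fitter> make_batch_fitter(int n_params) {
    std::unique_ptr<Fitter> fitter(new Fitter(n_params));
    fitter->fGraphicsMode = false;
    return fitter;
}

// Run scan_n iterations and return how many fitters had graphics mode off.
template <class Fitter>
int run_scan(int scan_n, int n_params) {
    int quiet = 0;
    for (int k = 0; k < scan_n; ++k) {
        std::unique_ptr<Fitter> minuit = make_batch_fitter<Fitter>(n_params);
        // ... SetFCN, mnparm, mncomd("scan ..."), Migrad would go here ...
        if (!minuit->fGraphicsMode) ++quiet;
    }  // the unique_ptr deletes the fitter here, replacing "delete minuit"
    return quiet;
}
```

In a ROOT macro the call would presumably be make_batch_fitter<TMinuit>(n_params), with TMinuit.h included.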

In any case, I would suggest using the newer Minuit2 implementation for more complex fits and scans.

Lorenzo

Hi @moneta,

I’ve deleted my previous reply to give an update. I turned the GraphicsMode off. The memory leak due to Minuit is gone :slight_smile:

It seems that mnplot is coded in such a way that you either create a lot of plots, which you can retrieve with

TGraph *gr = (TGraph*)gMinuit->GetPlot();
gr->Draw("al");

Or you have to deal with some printout in your terminal instead, which is of course much easier to handle.
Many thanks for your comment on my question!

Cheers,
Johannes