Possible memory leak in `fitTo` with ROOT 6.24

I’m using RooFit to fit a PDF from HistFactory. This worked with ROOT 6.22/08, but with 6.24/00 the fit appears to hang and memory use climbs slowly but steadily. I killed the process when it reached 7 GB.

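import ROOT
from time import perf_counter
from rich import print  # assumed: the [blue bold]...[/] markup below looks like rich
import rich.prompt as rp  # assumed from the rp.Prompt.ask calls below

# varToStr is a helper from the original script that is not shown here.
f = ROOT.TFile.Open("wkspace_scalar_1100.root")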
w = f.Get("combWS")
data = w.data("combData")
n_evts = int(data.sumEntries() + 1)
pdf = w.pdf("combPdf")
params = pdf.getParameters(data)
observables = data.get()
weight_var = ROOT.RooRealVar("_weight_", "_weight_", 1, 0., 1e7)
observables.add(weight_var)
print("\n[blue bold]Observables [/blue bold]")
print(
    "\n".join([varToStr(o) for o in observables]),
)
data.Print()
print(f"\n[blue bold]Events per dataset: {n_evts}[/]")
print(f"[blue bold]Num entries: {data.numEntries()}[/]")
print(f"[blue bold]Expected events: {pdf.expectedEvents(observables)}[/]")
rp.Prompt.ask("Press Enter to continue")
print("\n")
print("[green bold]Running fit...[/]")
fit_start = perf_counter()
poi = params.selectByName("xsec_br").first()
poi.setVal(0.0)
poi.setConstant(True)
res = pdf.fitTo(
    data,
    ROOT.RooFit.Offset(True),
    ROOT.RooFit.BatchMode(True),
    ROOT.RooFit.Save(True),
)
print("[blue bold]Parameters [/blue bold]")
print(
    "\n".join([varToStr(p) for p in params]),
)
print(f"Expected events: {pdf.expectedEvents(observables)}")
print(f"Took {perf_counter() - fit_start}s")
rp.Prompt.ask("Press Enter to continue")
print("[green bold]Generating toys...[/]")
w2 = ROOT.RooWorkspace("toys", "toys")
start = perf_counter()
spec = pdf.prepareMultiGen(
    observables,
    ROOT.RooFit.Name("toyData"),
    ROOT.RooFit.NumEvents(n_evts),
    ROOT.RooFit.Verbose(True),
    ROOT.RooFit.Extended(),
    ROOT.RooFit.AllBinned()
)

Hi @beojan,
let’s ping the RooFit experts @moneta and @jonas.

Cheers,
Enrico

Hi,

Can you please post your input ROOT file and your macro showing the problem (in case the one above is not complete), so we can reproduce and investigate it?

Thanks,

Lorenzo

That’s a bit difficult because that workspace contains unblinded ATLAS data from an unpublished analysis.

I could try the example workspace but I can’t find data/example.root.

If we are lucky, you might be able to pinpoint the exact line at which the leak happens by running the reproducer under `valgrind --suppressions=$ROOTSYS/etc/valgrind-root.supp --track-origins=yes <reproducer command>` and/or `valgrind --suppressions=$ROOTSYS/etc/valgrind-root.supp --tool=massif <reproducer command>`. You will need a build of ROOT that contains debug symbols; I think LXPLUS has some.

Otherwise, without a reproducer it’s hard to make progress.
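
The two valgrind invocations mentioned above can be scripted so that they are easy to rerun. This is only a minimal sketch: `valgrind_command` and the `rootsys` parameter are hypothetical names introduced here, and the flags are the ones quoted in this post.

```python
import os
import shlex


def valgrind_command(reproducer, tool="memcheck", rootsys=None):
    """Build a valgrind command line (as a list) for a reproducer command."""
    # ROOT ships a suppression file under $ROOTSYS/etc, as quoted in the post.
    rootsys = rootsys or os.environ["ROOTSYS"]
    cmd = ["valgrind", f"--suppressions={rootsys}/etc/valgrind-root.supp"]
    if tool == "memcheck":
        # memcheck is valgrind's default tool, so no --tool flag is needed.
        cmd.append("--track-origins=yes")
    elif tool == "massif":
        cmd.append("--tool=massif")
    else:
        raise ValueError(f"unsupported tool: {tool}")
    # Split the reproducer string the way a shell would.
    return cmd + shlex.split(reproducer)
```

The resulting list can be passed to `subprocess.run`, for example `subprocess.run(valgrind_command("python fit.py", tool="massif"))`, where `fit.py` stands in for the actual reproducer script.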

Is the example.root file available somewhere?

@moneta can correct me if I’m wrong, but I think it’s produced by $ROOTTUTDIR/histfactory/makeExample.C or similar.

Anyhow, I was able to run it under valgrind. It looks to me like it’s an std::map that’s causing the leak.
The Massif snapshots are attached.
massif2.vgdb.tar.gz (72.8 KB)

Can you please try with a build of ROOT with debug symbols, so we get line numbers in the valgrind logs? Also valgrind memcheck (i.e. not massif) should point exactly to what leaks if it sees something, which might be a more direct clue than massif (which simply lists all allocations).

Cheers,
Enrico

Hi,

If you cannot upload the file, can you post the code reproducing the problem with the HistFactory tutorial generated model, e.g. one of the results/example_combined_XXX_model.root files?

We have noticed a memory leak using one of those files, but it appeared between 6.14 and 6.16 and occurs when generating toys, not in fitting. See "Memory leak when running FrequentistCalculator scan in RooStats" (Issue #7890, root-project/root on GitHub).

Lorenzo

Here’s the example file and code. This model is small enough that the fit finishes almost instantly, which makes it hard to verify if the growth in memory use occurs here.

wk_issue_1.tar.gz (43.7 KB)
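
One possible way to check for memory growth on a model this small is to repeat the fit many times and record the process's peak memory after each call. The sketch below uses only the Python standard library (`resource`, so Unix only); `peak_rss_mb` and `track_memory` are hypothetical helper names, not part of ROOT.

```python
import resource
import sys


def peak_rss_mb():
    """Return the peak resident set size of this process in MB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def track_memory(step, repeats):
    """Call `step` repeatedly and record the peak RSS after each call."""
    samples = []
    for _ in range(repeats):
        step()
        samples.append(peak_rss_mb())
    return samples
```

Here `step` would wrap the `pdf.fitTo(...)` call from the first post. Because the peak can only go up, a value that keeps rising over many repeats hints at a leak, while a plateau after the first few calls suggests the fit itself is not accumulating memory.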

> but when generating toys not in fitting

Ironically I’m doing this fit in order to generate a set of toys with the fit model. That said, the hang and increasing memory usage do occur in the fitting step, and I’m not using the management classes that FrequentistCalculator uses.

Hi,

Thanks for posting the code. I think this is caused mainly by "Memory leak when using MemPoolForRooSets" (Issue #7933, root-project/root on GitHub), which is being investigated.
However, that issue has been present since 6.14, and you say that you did not have problems in 6.22.
I cannot run your code in 6.22; I would probably need a ROOT file generated for 6.22. It would be good if you could share the code for 6.22.
Thanks,
Lorenzo

I don’t think I had to make any changes to the code between 6.22 and 6.24.
