Possible memory leak in `fitTo` with ROOT 6.24

beojan · April 16, 2021, 1:20pm

I’m using RooFit to fit a PDF from HistFactory. This worked with ROOT 6.22/08, but with 6.24/00 the fit appears to hang and memory use climbs slowly but surely (I killed the process when it reached 7 GB).

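# Imports assumed from usage below: the [blue bold]...[/] markup and rp.Prompt.ask look like the rich library.
import ROOT
from time import perf_counter
from rich import print
import rich.prompt as rp
# varToStr is a user-defined helper (not shown here) that formats a RooFit variable as a string.

f = ROOT.TFile.Open("wkspace_scalar_1100.root")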
w = f.Get("combWS")
data = w.data("combData")
n_evts = int(data.sumEntries() + 1)
pdf = w.pdf("combPdf")
params = pdf.getParameters(data)
observables = data.get()
weight_var = ROOT.RooRealVar("_weight_", "_weight_", 1, 0., 1e7)
observables.add(weight_var)
print("\n[blue bold]Observables [/blue bold]")
print(
    "\n".join([varToStr(o) for o in observables]),
)
data.Print()
print(f"\n[blue bold]Events per dataset: {n_evts}[/]")
print(f"[blue bold]Num entries: {data.numEntries()}[/]")
print(f"[blue bold]Expected events: {pdf.expectedEvents(observables)}[/]")
rp.Prompt.ask("Press Enter to continue")
print("\n")
print("[green bold]Running fit...[/]")
fit_start = perf_counter()
poi = params.selectByName("xsec_br").first()
poi.setVal(0.0)
poi.setConstant(True)
res = pdf.fitTo(
    data,
    ROOT.RooFit.Offset(True),
    ROOT.RooFit.BatchMode(True),
    ROOT.RooFit.Save(True),
)
print("[blue bold]Parameters [/blue bold]")
print(
    "\n".join([varToStr(p) for p in params]),
)
print(f"Expected events: {pdf.expectedEvents(observables)}")
print(f"Took {perf_counter() - fit_start}s")
rp.Prompt.ask("Press Enter to continue")
print("[green bold]Generating toys...[/]")
w2 = ROOT.RooWorkspace("toys", "toys")
start = perf_counter()
spec = pdf.prepareMultiGen(
    observables,
    ROOT.RooFit.Name("toyData"),
    ROOT.RooFit.NumEvents(n_evts),
    ROOT.RooFit.Verbose(True),
    ROOT.RooFit.Extended(),
    ROOT.RooFit.AllBinned()
)

eguiraud · April 19, 2021, 8:09am

Hi @beojan,
let’s ping the RooFit experts @moneta and @jonas.

Cheers,
Enrico

moneta · April 19, 2021, 8:48am

Hi,

Can you please post your input ROOT file and (in case it is not complete) your macro showing the problem, so we can reproduce and investigate it?

Thanks

Lorenzo

beojan · April 19, 2021, 9:22am

That’s a bit difficult because that workspace contains unblinded ATLAS data from an unpublished analysis.

beojan · April 19, 2021, 9:44am

I could try the example workspace but I can’t find data/example.root.

eguiraud · April 19, 2021, 9:45am

If we are lucky, you might be able to pinpoint the exact line at which the leak happens by running the reproducer under `valgrind --suppressions=$ROOTSYS/etc/valgrind-root.supp --track-origins=yes <reproducer command>` and/or `valgrind --suppressions=$ROOTSYS/etc/valgrind-root.supp --tool=massif <reproducer command>`. You will need a build of ROOT that contains debug symbols; I think LXPLUS has some.

Otherwise, without a reproducer it’s hard to make progress.
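
A minimal sketch of how the two valgrind invocations above could be assembled from Python, for example to launch them with `subprocess.run`. The helper name `valgrind_command` and the script name `fit.py` are hypothetical; the flags are only the ones quoted in this post, and the suppressions path assumes `ROOTSYS` points at the ROOT installation.

```python
import os
import shlex


def valgrind_command(reproducer, tool="memcheck", rootsys=None):
    """Build the argv list for running `reproducer` under valgrind.

    `reproducer` is the command to profile, given as a list of strings.
    `rootsys` defaults to the ROOTSYS environment variable (assumption:
    the suppressions file lives at $ROOTSYS/etc/valgrind-root.supp).
    """
    if rootsys is None:
        rootsys = os.environ.get("ROOTSYS", "$ROOTSYS")
    cmd = ["valgrind", f"--suppressions={rootsys}/etc/valgrind-root.supp"]
    if tool == "memcheck":
        # memcheck is valgrind's default tool, so no --tool flag is needed
        cmd.append("--track-origins=yes")
    elif tool == "massif":
        cmd.append("--tool=massif")
    else:
        raise ValueError(f"unsupported tool: {tool!r}")
    return cmd + list(reproducer)


# "fit.py" is a placeholder for the actual reproducer script.
for tool in ("memcheck", "massif"):
    print(shlex.join(valgrind_command(["python", "fit.py"], tool=tool)))
```

The returned list can be passed directly to `subprocess.run(cmd)`, which avoids shell quoting problems in the reproducer command.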

beojan · April 19, 2021, 10:00am

Is the example.root file available somewhere?

eguiraud · April 19, 2021, 10:15am

@moneta can correct me if I’m wrong, but I think it’s produced by `$ROOTTUTDIR/histfactory/makeExample.C` or similar.

beojan · April 19, 2021, 10:33am

Anyhow, I was able to run it under valgrind. It looks to me like it’s a `std::map` that’s causing the leak.
The Massif snapshots are attached.
massif2.vgdb.tar.gz (72.8 KB)
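
For readers who want to check heap growth in Massif output without `ms_print`, here is a rough sketch that pulls the per-snapshot heap size out of a `massif.out` text file. It assumes the usual layout of `snapshot=`, `time=` and `mem_heap_B=` lines; the function name `parse_massif` is hypothetical, and the sample text is made up for illustration (it is not taken from the attachment above).

```python
def parse_massif(text):
    """Return one dict per snapshot with its number, time and heap bytes."""
    snapshots = []
    current = None
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # separators and header lines have no key=value form
        if key == "snapshot":
            current = {"snapshot": int(value)}
            snapshots.append(current)
        elif current is not None and key in ("time", "mem_heap_B"):
            current[key] = int(value)
    return snapshots


# Illustrative input only; real files also contain heap_tree details.
SAMPLE = """\
desc: (none)
cmd: python fit.py
time_unit: i
#-----------
snapshot=0
#-----------
time=0
mem_heap_B=0
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=1
#-----------
time=1000
mem_heap_B=1048576
mem_heap_extra_B=128
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=2
#-----------
time=2000
mem_heap_B=3145728
mem_heap_extra_B=256
mem_stacks_B=0
heap_tree=empty
"""

for snap in parse_massif(SAMPLE):
    print(snap["snapshot"], snap["time"], snap["mem_heap_B"])
```

A heap size that keeps rising across snapshots while the program is doing repetitive work is consistent with a leak, although Massif only lists allocations and does not prove that any of them are unreachable.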

eguiraud · April 19, 2021, 11:03am

Can you please try with a build of ROOT with debug symbols, so we get line numbers in the valgrind logs? Also valgrind memcheck (i.e. not massif) should point exactly to what leaks if it sees something, which might be a more direct clue than massif (which simply lists all allocations).

Cheers,
Enrico

moneta · April 19, 2021, 12:22pm

Hi,

If you cannot upload the file, can you post the code reproducing the problem with the model generated by the HistFactory tutorial, e.g. one of the results/example_combined_XXX_model.root files?

We have noticed a memory leak using one of those files, but it appeared between 6.14 and 6.16, and it shows up when generating toys, not in fitting. See “Memory leak when running FrequentistCalculator scan in RooStats” (Issue #7890, root-project/root on GitHub).

Lorenzo

beojan · April 19, 2021, 4:21pm

Here’s the example file and code. This model is small enough that the fit finishes almost instantly, which makes it hard to verify if the growth in memory use occurs here.

wk_issue_1.tar.gz (43.7 KB)

> but when generating toys not in fitting

Ironically I’m doing this fit in order to generate a set of toys with the fit model. That said, the hang and increasing memory usage do occur in the fitting step, and I’m not using the management classes that FrequentistCalculator uses.
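
When a step finishes too quickly to watch in a process monitor, one way to check for memory growth is to repeat the step and record the process's peak resident memory after each call. The sketch below does this with the standard-library `resource` module (Unix only). The helper names are hypothetical, and the deliberately leaky step is a stand-in for the real fit call. Python's `tracemalloc` is not used because it does not see allocations made inside C++ libraries.

```python
import resource
import sys


def peak_rss_mib():
    """Peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # Linux reports ru_maxrss in kilobytes, macOS in bytes.
    if sys.platform == "darwin":
        return peak / (1024 * 1024)
    return peak / 1024


def track_peak_rss(step, repeats):
    """Call step() `repeats` times and return the peak RSS after each call."""
    readings = []
    for _ in range(repeats):
        step()
        readings.append(peak_rss_mib())
    return readings


# Stand-in for the real workload: keeps 16 MiB alive per call.
hoard = []


def leaky_step():
    hoard.append(b"x" * (16 * 1024 * 1024))


readings = track_peak_rss(leaky_step, 4)
for i, mib in enumerate(readings, start=1):
    print(f"after call {i}: peak RSS {mib:.0f} MiB")
```

Because `ru_maxrss` is a peak value, the readings never decrease; a leak shows up as a steady climb with every repetition, while a leak-free step levels off after the first few calls.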

moneta · April 20, 2021, 3:13pm

Hi,

Thanks for posting the code. I think this is caused mainly by “Memory leak when using MemPoolForRooSets” (Issue #7933, root-project/root on GitHub), which is being investigated.
However, that issue has been present since 6.14, and you say that you did not have problems in 6.22.
I cannot run your code in 6.22; I would probably need a ROOT file generated with 6.22. It would be good if you could share the code for 6.22.
Thanks,
Lorenzo

beojan · April 20, 2021, 4:30pm

I don’t think I had to make any changes to the code between 6.22 and 6.24.

system · May 4, 2021, 4:30pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.