Memory Requirements for RooDataHist and RooDataSet

Dear All,

I am new to RooFit and I am trying to fit azimuthal modulations in the fragmentation of hadrons.
These are large datasets that are binned in several kinematic variables.

At first, I tried an unbinned likelihood fit with the data in a RooDataSet. Each entry had 7 variables: three binning variables and four angles. I was planning to fit the counts differential in the four angles.
However, this quickly ran into memory allocation errors. Since the internal data structure is a TTree I was hoping it would be swapped out to disk, but no.

The second attempt used binned histograms with ~1M bins each. My naive estimate was that, since each entry is a double and there may be extra members to keep track of weights etc., each histogram should use at most about 100 MB of memory. However, the actual memory usage is an order of magnitude larger and leads to a memory allocation segfault as well.
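
To make the naive estimate explicit, here is a back-of-envelope sketch in plain Python. The number of doubles stored per bin is an assumption for illustration only (the helper name `naive_hist_bytes` is hypothetical too); it is not taken from the actual RooDataHist layout.

```python
BYTES_PER_DOUBLE = 8  # size of a C++ double on common platforms


def naive_hist_bytes(n_bins, doubles_per_bin):
    """Naive footprint: every bin stores a fixed number of doubles."""
    return n_bins * doubles_per_bin * BYTES_PER_DOUBLE


n_bins = 1_000_000
# doubles_per_bin values are guesses: content only, content plus a few
# bookkeeping members, and a generous upper bound.
for doubles_per_bin in (1, 5, 12):
    mb = naive_hist_bytes(n_bins, doubles_per_bin) / 1e6
    print(f"{doubles_per_bin:2d} doubles/bin: {mb:.0f} MB")
```

Even with a dozen doubles per bin this stays below 100 MB, which is why a footprint ten times larger was unexpected.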

So my question is twofold:
-What memory usage should I expect per entry of RooDataHist?
-Is there a procedure to process large datasets that are differential in many variables, either unbinned using RooDataSet or binned using RooDataHist with many bins?

Thanks,

Anselm

Hi. Thanks for your question. We’ll follow up here as soon as possible.

Hi,

RooFit needs to have all the data in memory for fitting, even if you are using a TTree-based data store (which is not the default).
For a histogram, you should be able to fit 1M bins if it is a 1D histogram. There could be some overhead in RooFit, and some unneeded copies may be made. I could investigate this problem, but I would need you to post an example.

Best Regards

Lorenzo

Dear Lorenzo,

thank you very much for your answer. It was helpful to hear that all data has to fit into memory in any case.
I streamlined my code a bit by not saving the bins in the RooDataSet and instead doing the binning on the fly for each fit.
This seems to work for now.
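
For anyone reading later, the idea of binning on the fly can be sketched roughly as below. This is a plain-Python illustration and not my actual RooFit code: the variable names (`z`, `phi`), the bin edges, and both helper functions are made up for the example. The point is only that the kinematic bin is computed per event when needed instead of being stored as extra columns.

```python
import bisect
import math


def kinematic_bin(value, edges):
    """Return the index of the bin containing value, or None if out of range."""
    i = bisect.bisect_right(edges, value) - 1
    return i if 0 <= i < len(edges) - 1 else None


def angular_counts(events, z_edges, target_bin, n_phi_bins):
    """Histogram phi for events that fall into one kinematic bin.

    events is any iterable of (z, phi) pairs, so it can be a generator
    that streams from disk; only the small counts list stays in memory.
    """
    counts = [0] * n_phi_bins
    width = 2 * math.pi / n_phi_bins
    for z, phi in events:
        if kinematic_bin(z, z_edges) != target_bin:
            continue
        idx = int((phi % (2 * math.pi)) / width)
        counts[min(idx, n_phi_bins - 1)] += 1  # guard against rounding at 2*pi
    return counts


# Hypothetical toy events: (z, phi)
events = [(0.25, 0.1), (0.25, 3.2), (0.55, 1.0), (0.30, 6.0)]
z_edges = [0.2, 0.4, 0.6]
print(angular_counts(events, z_edges, target_bin=0, n_phi_bins=4))
```

Each fit then only sees the small angular histogram for one kinematic bin at a time.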

Best regards,

Anselm