Hi everyone,
I have been using RooFit for a while now with some great results.
Now I'm trying to scale up to using "real size" datasets, which are large: about 10^12 points (until now I have been working with sets of around 10^7 points or so).
I am wondering: what is the best way to get this data into RooFit as quickly as possible? I am using compiled C++, and have the data available as plain arrays of doubles. Suppose I have only 1D data.
I could get the right answer with a tight copy loop:
for (int i = 0; i < 100000; i++) {
  rv = mydata[i];              // set the RooRealVar's current value
  histo.add(RooArgSet(rv));    // add one entry to the RooDataHist
}
for rv a RooRealVar, and histo a RooDataHist. This is what my program does now, and it works. However, it's a bit slow for larger amounts of data. Is there some way I can tell RooFit "here is a pointer to an array of this many doubles, please bin this data as quickly as you can"? Or should I be binning/preprocessing this data myself, and only giving RooFit reduced data?
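To make that second option concrete, here is an untested sketch of what I mean by binning the data myself: fill a plain ROOT TH1D in a tight loop, then hand the finished histogram to RooFit through the RooDataHist constructor that takes a TH1. The function name binAndImport and the parameters nPoints, nbins, xlo, xhi are just placeholders I made up:

#include "TH1D.h"
#include "RooRealVar.h"
#include "RooArgList.h"
#include "RooDataHist.h"

// Fill a plain ROOT histogram first, then import it into RooFit in one go.
RooDataHist* binAndImport(const double* mydata, long long nPoints,
                          int nbins, double xlo, double xhi) {
  TH1D h("h", "pre-binned data", nbins, xlo, xhi);
  for (long long i = 0; i < nPoints; ++i) {
    h.Fill(mydata[i]);  // plain ROOT fill, no RooArgSet constructed per point
  }
  RooRealVar x("x", "x", xlo, xhi);
  // Copies the bin contents once; the variable's binning is taken from the TH1.
  return new RooDataHist("dh", "pre-binned data", RooArgList(x), &h);
}

(The idea being that TH1::Fill avoids the per-point RooArgSet construction, and RooDataHist then only has to copy nbins bin contents instead of touching every point again.)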