Storing RooFitResults in a TTree: reading is extremely slow. Bug?

Hi,

For a toy study I am generating a large number of datasets from a given PDF with RooFit and fitting each of them. The fit results should be saved to a file for later processing, and I thought putting all the RooFitResults into a TTree would be a good solution. However, with my approach, reading the fit results back gets slower with every entry. Since a toy study quickly requires thousands of fits, this makes my approach unusable.

To demonstrate, I’ve put a small example online at

e5.physik.tu-dortmund.de/~fkruse/root-ttree/

ToyTestBugMain.cpp is a standalone file that reproduces the problem, and make.sh builds the binary. run.sh runs the binary 1000 times, thereby filling 1000 fit results into a TTree (in testbug.root), and then runs the binary once more with a parameter, which reads all results back from the tree. You should notice that the reading gets slower and slower. If you do not want to wait for the fits, I also provide a testbug.root online, which can be read by calling ToyTestBugMain with a parameter. I am using ROOT 5.30/01.

The relevant code to reproduce is this:

// include statements skipped, see web link above

RooWorkspace* BuildPDF() {
  // omitted for clarity, can be found online
  // just creates some PDF combination and puts this on a RooWorkspace to return afterwards
  return ws;
}

void StoreResult() {
  RooWorkspace* ws = BuildPDF();
  RooDataSet* data = ws->pdf("pdf_add")->generate(*ws->set("argset_obs"), 1000, Extended(true));
  data->Print();
  RooFitResult* fit_result = ws->pdf("pdf_add")->fitTo(*data, NumCPU(2), Extended(true), Save(true), Strategy(2), Minos(false), Hesse(false), Verbose(false),Timer(true));
  
  TFile f("testbug.root","update");
  TTree* tree_results = NULL;
  tree_results = (TTree*)f.Get("results");
  if (tree_results == NULL) {
    tree_results = new TTree("results", "Tree for toy study fit results");
    tree_results->Branch("fit_results", "RooFitResult", &fit_result, 64000, 0);
  } else {      
    tree_results->SetBranchAddress("fit_results", &fit_result);
  }
  
  tree_results->Fill();
  tree_results->Write("",TObject::kOverwrite);
  f.Close();
}

void ReadFiles() {
  TFile file("testbug.root", "read");
  TTree* tree = (TTree*)file.Get("results");
  
  TBranch* result_branch = tree->GetBranch("fit_results");
  RooFitResult* fit_result_read = NULL;
  result_branch->SetAddress(&fit_result_read);
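  // reserve capacity only; constructing the vector with a size would
  // put GetEntries() null pointers in front of the pushed results
  std::vector<RooFitResult*> fit_results_;
  fit_results_.reserve(tree->GetEntries());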
  
  TStopwatch sw;
  for (int i=0; i<tree->GetEntries(); ++i) {
    result_branch->GetEntry(i);
    
    // save a copy
    sw.Reset();
    sw.Start();
    fit_results_.push_back(new RooFitResult(*fit_result_read));
    std::cout << i << std::endl;
    delete fit_result_read;
    fit_result_read = NULL;
    sw.Stop();
    sw.Print();
  }
  
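  // no explicit delete of the branch or the tree: the branch belongs to
  // the tree, and the tree belongs to the file, which cleans it up on close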
}

int main(int argc, char *argv[]) {
  StoreResult();
  if (argc > 1) ReadFiles();
}

StoreResult() generates a RooDataSet and fits it. The resulting RooFitResult is then saved to the TTree (the TTree is created if it does not exist yet). ReadFiles() opens the TTree and stores the fit results in a std::vector. For each loop iteration the time needed for the copy is printed, and this time increases from one iteration to the next.

First of all, am I doing something wrong? If not, is this a bug? If this is a bug, how can I save lots of RooFitResults comfortably into some sort of file (as a workaround until this one is fixed)?

After some more testing, this problem does not seem to be directly related to TTree. Splitting the results into two ROOT files, each with its own TTree, does not help: after the first file is finished, the time per result stays as high as it was and keeps increasing while the second file is read.

As far as I can see, the call to the copy constructor

new RooFitResult(*fit_result_read)

is what’s getting slower and slower.

I may have found a workaround. Instead of doing

    // save a copy
    sw.Reset();
    sw.Start();
    fit_results_.push_back(new RooFitResult(*fit_result_read));
    std::cout << i << std::endl;
    delete fit_result_read;
    fit_result_read = NULL;
    sw.Stop();
    sw.Print();

I now do

    // keep the object that was read, without copying it
    sw.Reset();
    sw.Start();
    fit_results_.push_back(fit_result_read);
    std::cout << i << std::endl;
    fit_result_read = NULL;
    sw.Stop();
    sw.Print();

This avoids the copy constructor call, and the read loop then works properly. However, as far as I can tell, the first version should not cause the problems I am seeing. So is this a bug?
