I have some large RooDataHists (~12M bins) that I would like to import into a RooWorkspace and write to disk.
Saving the RooDataHists directly into a single ROOT file is not a problem, but writing the workspace crashes with:
Error in TBufferFile::WriteByteCount: bytecount too large (more than 1073741822)
How can I avoid this problem?
Please find attached an example which reproduces the issue.
I would try converting the RooDataHist to a RooDataSet or to a TH1 (or a TH2/TH3), which should use less memory.
Otherwise, maybe @pcanal has a solution for avoiding this error in TBufferFile.
I managed to get around this issue for now with some creative use of TH3F and a somewhat reduced number of bins.
Nevertheless, it might be useful to keep an eye on such issues: with ever more powerful computer infrastructure, this limitation will potentially show up more commonly in the future.
We looked into TBufferFile in some more detail: the buffer size seems to be limited by the size of UInt_t (https://root.cern.ch/doc/master/TBufferFile_8cxx_source.html). Is there any way to enlarge this to ULong_t? Unfortunately, this limit significantly restricts our analysis development. I'd also still be curious to learn whether there are other ways to avoid this error. @pcanal - do you think you could help with that?
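For a rough sense of scale, here is a small back-of-the-envelope sketch in plain Python (no ROOT needed). The limit is copied from the error message above, and the bin count is the ~12M quoted in the question. The bytes-per-bin figures are illustrative assumptions only, not measured sizes of what RooFit actually streams per bin. Note that the number in the message equals 2^30 - 2, about a quarter of the full UInt_t range.

```python
# Limit quoted in the TBufferFile::WriteByteCount error message above.
BYTECOUNT_LIMIT = 1_073_741_822
# Largest value an unsigned 32-bit integer can hold, for comparison.
UINT32_MAX = 2**32 - 1


def max_bytes_per_bin(n_bins, limit=BYTECOUNT_LIMIT):
    """Largest whole number of bytes per bin that still fits under the limit."""
    return limit // n_bins


def fits_in_one_buffer(n_bins, bytes_per_bin, limit=BYTECOUNT_LIMIT):
    """True if n_bins * bytes_per_bin stays within the byte-count limit."""
    return n_bins * bytes_per_bin <= limit


if __name__ == "__main__":
    n_bins = 12_000_000  # "~12M bins" from the question
    print("limit == 2**30 - 2:", BYTECOUNT_LIMIT == 2**30 - 2)
    print("budget per bin (bytes):", max_bytes_per_bin(n_bins))
    # Assumed payloads: 4 bytes (one float per bin) and
    # 96 bytes (a hypothetical dozen doubles per bin).
    for bytes_per_bin in (4, 96):
        print(bytes_per_bin, "bytes/bin fits:", fits_in_one_buffer(n_bins, bytes_per_bin))
```

Under these assumptions the budget is a few tens of bytes per bin, so a single float per bin fits comfortably, while an object that carries many doubles per bin and is written as one piece can plausibly exceed the limit.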
Unfortunately, this is a challenging change, as it would alter the file format and likely break forward compatibility (older versions of ROOT would be unable to read files written by newer versions). The naive implementation would also increase the data size (recording the offset would take 8 bytes instead of 4).
So it is not trivial to implement; however, as you point out, we will need to tackle this eventually, or soon-ish.
In the meantime, you could try storing the very large object in a TTree (containing a single entry) so that it gets split.
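A minimal PyROOT sketch of this single-entry TTree idea might look like the following. The helper name, tree name, and branch name are hypothetical. Whether a given class is actually split into sub-branches depends on how it is streamed (classes with a custom streamer are generally stored unsplit), so this is something to try rather than a guaranteed fix.

```python
def write_as_single_entry_tree(obj, filename, tree_name="holder", branch_name="payload"):
    """Store `obj` as the only entry of a TTree inside `filename`.

    Hypothetical helper: the names and defaults are illustrative only.
    """
    # Imported lazily so the helper can be defined without ROOT installed.
    import ROOT

    out = ROOT.TFile.Open(filename, "RECREATE")
    tree = ROOT.TTree(tree_name, "one-entry container for a large object")
    # 32000 (buffer size) and 99 (split level) are the TTree::Branch defaults,
    # spelled out to make the requested splitting explicit.
    tree.Branch(branch_name, obj, 32000, 99)
    tree.Fill()   # exactly one entry: the object itself
    tree.Write()
    out.Close()
```

Reading the object back would then mean opening the file, getting the tree, and loading entry 0. The point of the split level is that each sub-branch is written through its own buffer, so no single buffer has to hold the whole object.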