I’m afraid @pcanal would need to say whether it is possible to get a “better” branch splitting (I was trying to impose the “max” one).
Well, you could implement the following “brutal fix”.
As I understand, your whole experimental data “sample” produces a THnSparse which is too big to be stored in a ROOT file.
So, try to “split” / “divide” your whole experimental data “sample” into several “subsamples” (or even several tens of “subsamples”, if needed).
Each “subsample” would then (possibly / hopefully) produce a much smaller “partial” THnSparse and you should be able to store these “partial” histograms in a ROOT file, either directly as separate objects or in a TTree. You could create a single ROOT file with all “partial” histograms or one ROOT file per “partial” histogram.
So, if your raw experimental data are spread across multiple files, you could take each raw experimental data file as one physical “subsample” or, if you have just one single file with all raw experimental data, simply divide the total number of events by some number and create that many logical “subsamples” (or one “subsample” per hour, day, or week of measurements).
Another (quite clever) way to split your data into “subsamples” is to monitor the actual total number of bins of your THnSparse while you fill it (THnSparse::GetNbins). Once this number reaches a certain maximum value defined by you (it should be small enough that you can still save the histogram in a ROOT file; let’s say 10 to 50 million bins could be fine, I guess), you simply write the current “partial” THnSparse histogram to a ROOT file, then recreate the THnSparse histogram (or reset it so that all previous bins are gone) and continue filling the next “partial” histogram.
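A minimal sketch of that “monitor and rotate” idea could look as follows. Note that the event loop, the axis definitions, the `partial_%d.root` file-name pattern, and the 10-million-bin limit are all just assumptions for illustration:

```cpp
#include "THnSparse.h"
#include "TFile.h"
#include "TString.h"

void fill_partials(/* ... your event source ... */)
{
   const Int_t ndim = 3; // e.g. a 3 dimensional histogram
   Int_t bins[ndim] = {4096, 4096, 4096};
   Double_t xmin[ndim] = {0., 0., 0.};
   Double_t xmax[ndim] = {1., 1., 1.};
   THnSparseD *h = new THnSparseD("h", "h", ndim, bins, xmin, xmax);
   const Long64_t maxbins = 10000000; // your chosen maximum (example value)
   Int_t ipart = 0; // index of the current "partial" histogram
   Double_t x[ndim];
   while (/* next event available */ false) {
      // ... fill x[] from the current event ...
      h->Fill(x);
      if (h->GetNbins() >= maxbins) {
         // write the current "partial" histogram and start a fresh one
         TFile f(TString::Format("partial_%d.root", ipart++), "RECREATE");
         h->Write("h");
         f.Close();
         h->Reset(); // all previous bins are gone, continue filling
      }
   }
   // do not forget the last (incomplete) "partial" histogram
   if (h->GetNbins() > 0) {
      TFile f(TString::Format("partial_%d.root", ipart++), "RECREATE");
      h->Write("h");
      f.Close();
   }
}
```

Whether you `Reset` the histogram or delete and recreate it makes no practical difference here; resetting just avoids repeating the axis setup.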
For test purposes, I created some 4096x4096x4096 3 dimensional THnSparse histograms and filled them with random values. I have found that the average TFile buffer/basket size needed by such histograms can easily be estimated as follows. For histograms filled without weights one needs “number_of_filled_bins * (sizeof(bin) + 5)” bytes, while for histograms for which THnSparse::Sumw2() has been called one needs “number_of_filled_bins * (sizeof(bin) + 13)” bytes (i.e. weights are always “Double_t”). Here “sizeof(bin)” is 8 for “Double_t” and “Long_t” and 4 for “Float_t” and “Int_t”, and “number_of_filled_bins” is given by THnSparse::GetNbins().
Then, you just need a simple small ROOT macro, which reads / retrieves all “partial” histograms (from a single ROOT file or from many) and adds them up in RAM. Well, you will always need to run this macro at the beginning of your ROOT session, of course … but that should really be very fast.
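Such a macro could be sketched like this, assuming (hypothetically) one file per “partial” histogram, named “partial_0.root”, “partial_1.root”, …, each containing a THnSparseD called “h”:

```cpp
#include "THnSparse.h"
#include "TFile.h"
#include "TString.h"

// Read all "partial" histograms and add them up into one THnSparse in RAM.
THnSparseD *merge_partials(Int_t npartials)
{
   THnSparseD *sum = nullptr;
   for (Int_t i = 0; i < npartials; i++) {
      TFile f(TString::Format("partial_%d.root", i), "READ");
      THnSparseD *h = (THnSparseD *)f.Get("h");
      if (!h) continue; // file missing or histogram not found
      if (!sum)
         sum = (THnSparseD *)h->Clone("sum"); // in-memory copy, survives f.Close()
      else
         sum->Add(h);
      f.Close();
   }
   return sum;
}
```

Since a THnSparse (unlike a TH1) is not attached to any TDirectory, the cloned “sum” simply lives in RAM and remains valid after the files are closed.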