How to handle a THnSparse larger than 500 MB

Hi,

I am trying to implement a 3D matrix of size 4096x4096x4096 for gamma-ray spectroscopy.

My problem is that beyond a certain amount of data, my output file becomes too large and it seems that ROOT can no longer handle it:

Error in TBufferFile::CheckByteCount: object of class TObjArray read too many bytes: 1183220729 instead of 109478905
Warning in TBufferFile::CheckByteCount: TObjArray::Streamer() not in sync with data on file Summed_Runs_076088_to_076227.root, fix Streamer()

This kind of gamma-ray cube has already been developed in other software (like RadWare), but I want to be able to use it with ROOT.

Does anyone have an idea how to implement this kind of object?

Thanks in advance

Jérémie

Help on Method/s to create Tree from multi-fold gamma-ray coincidence data

Thanks, but that thread deals with how to build a Tree from which a THnSparse is filled at the end. That is not my problem: I already have my Trees and my THnSparse, and I can work with them on a small amount of data. My problem is doing the same with large statistics.

I explain there the error message you get, and I also propose a trick with which one can reduce the amount of required RAM (for a 3-dimensional THnSparse) by (up to) a factor of 6.

Yes, I had already read that whole post before posting my question here. And I already use what you are proposing (an unsymmetrized THnSparse, if that is what you are talking about).

Here is the way I am filling my THnSparse:

// sort the event's energies so that only ordered triples
// (EGamma[i] >= EGamma[j] >= EGamma[k]) are filled
std::sort(EGamma, EGamma + GammaMult);

for (int i = 0; i < GammaMult; i++) {
    for (int j = 0; j < i; j++) {
        for (int k = 0; k < j; k++) {
            hEnG_entry[0] = EGamma[i];
            hEnG_entry[1] = EGamma[j];
            hEnG_entry[2] = EGamma[k];

            hnsparse->Fill(hEnG_entry);
        }
    }
}

Using this code, my output file is already 500 MB for 2 hours of data (out of a 50-day experiment…).

How much is “GammaMult”? Always 3? Less than or equal to 3?

It can go from 1 to 16 but, according to this loop, the histogram is only filled for GammaMult >= 3, right?

I think you can use the code from here (“GammaMult” = “No_Clover”):

I think both methods are equivalent: I always fill with i > j > k on sorted energies, and you are filling with i < j < k. In a two-dimensional view, you are filling above the diagonal and I am filling below it (right?).
Anyway, I have tried with your code and both output files are identical (same size) :frowning:

Do you think it would be possible to avoid this size limit by building a kind of THnSparse in a TTree, where the number of entries is the number of filled bins, each entry holding an array with the coordinates of the bin? I would rather have your opinion on this before starting to implement it.

I don’t really know what it is that you actually want to do with your “gammas”.
You can have a simple tree which keeps all your gammas in a “raw format” (i.e. each tree entry could keep all “GammaMult” energies which belong to it, without making any groups of three) and then you can “analyse” this tree, creating 1, 2, 3 dimensional histograms / projections in RAM. As long as you do not try to write these histograms to disk, you can make them as huge as your RAM allows (note, however, that if you want to draw them, you’d better keep the number of bins small, otherwise they will take a very long time to display). Of course, each time you create a new histogram, you will need to loop over your tree again (but that’s what ROOT does well).

What I need in the end is to project onto a 1D histogram the gamma rays which are in coincidence with two other ones. But this needs to be fast: I cannot read all the entries of a ROOT Tree for each projection, it would take ages. Ideally I need a THnSparse or an equivalent structure such that a projection takes a few seconds. This kind of cube has been used for many years in other software (like RadWare or GASPware). It's a pity that I cannot do the same thing with ROOT objects.

I don’t perfectly understand how it works, but could the TKDTree class be a solution to this kind of problem?

I guess we need @pcanal to state when / if the 1GB ROOT TFile buffer/basket limit will disappear (so that huge histograms could be saved to / retrieved from a ROOT file).

In the meantime … you can try to live with a THnSparseC (1 byte per bin) or a THnSparseS (2 bytes per bin) but, when filling, make sure that you never exceed the maximum allowed bin content (i.e. 127 for “Char_t” and 32767 for “Short_t”).

In my case, the limit does not seem to come from the 1 GB file size (mine is 515 MB); the error comes from the size of the TObjArray inside the THnSparse. Is it the same thing?

But if this is only related to the TFile buffer size, would it be possible to store a binary file on disk (larger than 1 GB) and load it into an in-memory THnSparse for analysis?

Indeed, I will try with a THnSparseS (it’s a pity that there is no THnSparse for unsigned short…).

// for a "Short_t" based "THnSparseS *hnsparse"
if (hnsparse->GetBinContent(hnsparse->GetBin(hEnG_entry)) < 32767) {
    hnsparse->Fill(hEnG_entry);
} else {
    std::cout << "SATURATED!" << std::endl;
}

For such large data, we recommend storing it in a split TTree.

The 1GB limit is challenging to remove and thus is likely to only be lifted in the so-called ROOT v7 format.

Cheers,
Philippe.