Best way to deal with unknown range of variable for histogram axis

Hi,

suppose I want to populate a histogram where I don’t know in advance what the precise range of the variable is, but I do know the bin width that I need.

I also need to have the possibility to hadd files with two such histograms, where the variable may possibly have different ranges in the two histograms.

How do I deal with this?

Of course I can create a histogram with a large number of bins with a wide range, but that gets very expensive very quickly in terms of memory.

Also, it seems I can’t use TH1::SetCanExtend, because that keeps the number of bins fixed (rather than the bin width).

Maybe I can use TH1::LabelsInflate before filling, but it looks like it only works by extending the axis in the positive direction, so I will need to fix some sort of minimum, and I’m back at square one (if the minimum of the range turns out to be larger than the minimum I fixed, LabelsInflate will generate a lot of useless bins…)

Thank you in advance!

Hi,

I think in this case you would need to compute the number of bins, min and max yourself. ROOT supports only the case where the number of bins is fixed.
You can always use the TH1 buffer to store your data while you are filling the histogram, and then provide your own axis instead of relying on the automatically computed one.
Here is an example:

{
auto h1 = new TH1D("h1","h1",100,1,0);
h1->SetBuffer(1000000);   // set buffer size larger than number of entries
for (int i = 0; i < 100000; i++) { h1->Fill(gRandom->Gaus(10,5)); }
 // using h1->Draw() we compute automatic min and max of axis 
h1->Draw(); 
auto x1=h1->GetXaxis()->GetXmin();
auto x2=h1->GetXaxis()->GetXmax();
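// recompute number of bins given fixed bin width
double binwidth=0.1; 
// round up so the full range is covered, then move x2 so the width is exactly binwidth
int nbins = TMath::CeilNint((x2-x1)/binwidth);
x2 = x1 + nbins*binwidth;
// set new axis
h1->SetBins(nbins,x1,x2);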
// histogram will be re-filled with new axis
h1->Draw();
}
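Since the histograms also need to be added together later, it may help to snap the axis limits to multiples of the bin width, so that two histograms with different ranges still share the same bin edges. Below is a minimal sketch of that arithmetic in plain C++, without ROOT. `AxisSpec` and `makeAlignedAxis` are hypothetical names introduced here for illustration, not part of ROOT.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type (not a ROOT class): the three numbers an axis needs.
struct AxisSpec {
    int nbins;
    double xmin;
    double xmax;
};

// Snap [lo, hi] outward to multiples of binWidth, so that every axis built
// this way has its edges on the same grid, whatever its range.
AxisSpec makeAlignedAxis(double lo, double hi, double binWidth) {
    assert(binWidth > 0 && hi >= lo);
    long first = static_cast<long>(std::floor(lo / binWidth));
    // +1 because the upper edge of a bin is exclusive: hi must fall inside the last bin
    long last = static_cast<long>(std::floor(hi / binWidth)) + 1;
    return {static_cast<int>(last - first), first * binWidth, last * binWidth};
}
```

The result could then be passed on as h1->SetBins(axis.nbins, axis.xmin, axis.xmax), with lo and hi taken from the automatically computed axis. Whether this is enough for hadd to combine histograms with different ranges is not confirmed here and should be checked.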

Lorenzo

Thank you very much, that works very well!

Just a clarification: do I understand correctly that I can substitute both calls to Draw() with
h1->BufferEmpty(-1); if I don’t need graphics at this stage?

Yes, this is correct!

Sorry, I just realized that this won’t be any good for the kind of data I have.

Let’s say my data are something like:

3000, 3000, 3000, ....,
3001, 3001, 3001, ....,
3002, ...,
...

It will be much less efficient to allocate an enormous buffer to accommodate every single entry than to just have one big histogram with very large limits, especially since the number of entries is much larger than the possible number of bins…
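For data like this, with many entries but few distinct bins, one possible alternative is to count per bin index in a map while filling, so that memory scales with the number of occupied bins rather than with the number of entries or the axis range. The sketch below is plain C++ under that assumption. `SparseHist` is a hypothetical type introduced for illustration, not a ROOT class.

```cpp
#include <cassert>
#include <cmath>
#include <map>

// Hypothetical sparse histogram (not a ROOT class): only occupied bins use memory.
struct SparseHist {
    double binWidth;
    std::map<long, double> counts;  // key k covers [k*binWidth, (k+1)*binWidth)

    explicit SparseHist(double w) : binWidth(w) { assert(w > 0); }

    // Add one entry; no axis range has to be known in advance.
    void fill(double x, double weight = 1.0) {
        counts[static_cast<long>(std::floor(x / binWidth))] += weight;
    }

    // Combine with another histogram of the same bin width, whatever its range.
    void add(const SparseHist& other) {
        assert(other.binWidth == binWidth);
        for (const auto& kv : other.counts) counts[kv.first] += kv.second;
    }
};
```

As far as I know, hadd would not merge such an object out of the box, so the counts would still have to be copied into a regular histogram once the occupied range is known.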

If there is no other solution, as a workaround, is there a way to just keep my big histogram while filling, and then throw away all the empty bins at the two ends of each axis? Something like TH1::SetRange(), but actually removing the bins instead of just hiding them. Maybe in the 1D case calling TH1::Rebin() with an array of bin edges will do the job, but my actual case is 2D, and it looks like I can’t pass arrays of bin edges to Rebin() in that case…
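The first half of that workaround, finding which bins to keep, can be sketched in plain C++ as below. It assumes the 2D contents have been copied into a vector of vectors. `BinRange` and `findOccupiedRange` are hypothetical names, not ROOT API. In ROOT the contents would presumably be read with TH2::GetBinContent (where bins are numbered from 1) and copied into a new, smaller histogram; that copying step is an assumption and is not shown.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper type: inclusive, 0-based index ranges of the occupied region.
struct BinRange {
    int xFirst, xLast, yFirst, yLast;
    bool empty;  // true if no bin has non-zero content
};

// contents[ix][iy] holds the bin content; all rows are assumed to have equal length.
BinRange findOccupiedRange(const std::vector<std::vector<double>>& contents) {
    BinRange r{0, -1, 0, -1, true};
    for (std::size_t ix = 0; ix < contents.size(); ++ix) {
        for (std::size_t iy = 0; iy < contents[ix].size(); ++iy) {
            if (contents[ix][iy] == 0.0) continue;  // skip empty bins
            int x = static_cast<int>(ix), y = static_cast<int>(iy);
            if (r.empty) {
                r = {x, x, y, y, false};  // first occupied bin found
            } else {
                if (x < r.xFirst) r.xFirst = x;
                if (x > r.xLast) r.xLast = x;
                if (y < r.yFirst) r.yFirst = y;
                if (y > r.yLast) r.yLast = y;
            }
        }
    }
    return r;
}
```

The new axis limits would then be the low edge of the first kept bin and the upper edge of the last kept bin on each axis, which leaves the bin width unchanged.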