Smooth 2D histogram

Hi,
Is there any macro/function or method available to smooth a 2D histograms? I have found this only for 1D histogram.

Thanks in advance.
Best regards
Andrea Bulgarelli

We do not have 2-d histograms smoothing functions.
If somebody succeeds with an implementation let us know.

Rene

Rene: This code fragment smoothes a TH2 using a fixed kernel.

Admittedly the code has some problems: [ul] [li] c-like, rather than c+±like parlance[/li]
[li] no error checking of the input[/li]
[li] hard-coded kernel (probably should allow the user to pass the kernel to use. May need another class for this)[/li]
[li] special treatment for zero valued bins (needed for my problem, but probably not generally applicable)[/li]
[li] no thought given to efficency.[/li]
[li] In general we only want to deal with the error channel if the user want to use it (i.e. SumW2())[/li]
[li] No attention is paid to or use made of the over- and under-flow bins.[/li][/ul] but it was a quick hack to solve a simple problem.

On the plus side the automatic normalizing behavior deals with the edge and corner bins reasonably well.

If this is what you have in mind, I can clean it up bit. [code]
void smooth_th2(TH2 *h){
// Sample kernel
const Int_t ksize_x=5;
const Int_t ksize_y=5;
Double_t kernel[ksize_x][ksize_y] = { { 0, 0, 1, 0, 0 },
{ 0, 2, 2, 2, 0 },
{ 1, 2, 5, 2, 1 },
{ 0, 2, 2, 2, 0 },
{ 0, 0, 1, 0, 0 } };

// Determine the size of the bin buffer(s) needed
Int_t lowest_bin=h->GetBin(0,0);
Int_t highest_bin=h->GetBin(h->GetNbinsX()+1,h->GetNbinsY()+1);
Int_t bufSize = highest_bin-lowest_bin+1;
Double_t *buf = new Double_t[bufSize];
Double_t *ebuf = new Double_t[bufSize];

// Copy all the data to the temporry buffers
for (Int_t i=0; i<=h->GetNbinsX(); i++){
for (Int_t j=0; j<=h->GetNbinsY(); j++){
Int_t bin = h->GetBin(i,j);
buf[bin] =h->GetBinContent(bin);
ebuf[bin]=h->GetBinError(bin);
};
};

// Kernel tail sizes (kernel sizes must be odd for this to work!)
Int_t x_push = (ksize_x-1)/2;
Int_t y_push = (ksize_y-1)/2;

// main work loop
for (Int_t i=1; i<=h->GetNbinsX(); i++){
for (Int_t j=1; j<=h->GetNbinsY(); j++){
Double_t content = 0.0;
Double_t error = 0.0;
Double_t norm = 0.0;

  for (Int_t n=0; n<ksize_x; n++){
    for (Int_t m=0; m<ksize_y; m++){
      Int_t xb = i+(n-x_push);
      Int_t yb = j+(m-y_push);
      if ( (xb >= 1) && (xb <= h->GetNbinsX()) &&
           (yb >= 1) && (yb <= h->GetNbinsY()) ){
        Int_t bin = h->GetBin(xb,yb);
        Double_t k = kernel[n][m];
        if ( (k != 0.0 ) && (buf[bin] != 0.0) ) { // General version probably does not want the second condition
          content += k*buf[bin];
          error   += k*k*buf[bin]*buf[bin];
          norm    += k;
        };
      };
    };
  };

  content /= norm;
  error /= (norm*norm);
  if ( content != 0.0 ) { // General version probably does not want this condition
    h->SetBinContent(i,j,content);
    h->SetBinError(i,j,sqrt(error));
  };
};

};

delete buf;
delete ebuf;
};
[/code]

There is also no fundamental reason you can’t do something similar with TH3s.

Hi David,

Excellent!!

I have converted your C function to a TH2::Smooth member function.
This function is also visible from the TH2 context menu, such that (like for a TH1)
you can smooth a TH2 just by pointing to it in the canvas.

I did several changes in your code and I still have to add correct support for the axis ranges. The code is now in the SVN trunk.
It would be nice to have a similar implementation for a TH3 ::slight_smile:

Thanks very much for this contribution.

Rene

Rene,
I looked at the code in the archive, and have several comments

First, the kernel I pasted in yesterday is not the one I use in my own code, I end up preferring:

  Double_t kernel[ksize_x][ksize_y] = { { 0, 1, 2, 1, 0 },
                                        { 1, 2, 4, 2, 1 },
                                        { 1, 4, 8, 4, 2 },
                                        { 1, 2, 4, 2, 1 },
                                        { 0, 1, 2, 1, 0 } };

which has a slightly less anisotropic structure. In theory, the kernel probably ought to be user selectable, perhaps by passing an option, or by some other mechanism. The code I gave you will handle kernels of any odd-odd dimension. 3x3 kernels are perhaps more common in most applications.

Second: There is a (currently harmless) bug in the code I gave you. The loop limits on the copy and work loops are different.

  // Copy all the data to the temporry buffers
  for (Int_t i=0; i<=h->GetNbinsX(); i++){

// ... 

  // main work loop
  for (Int_t i=1; i<=h->GetNbinsX(); i++){

Since I never use the underflow bins, we probably don’t want to copy them. Or we do want to copy the overflow bins as well, or something.

Third: Your comment at the top of the code claims that the error bins are untouched, but they appear to be handled in the code you committed.

Fourth: You left in both of my “What to do with zero-valued bins?”-conditions that probably are not applicable to the general purpose algorithm.

            if ( (k != 0.0 ) && (buf[bin] != 0.0) ) { // General version probably does not want the second condition
//...
      if ( content != 0.0 ) { // General version probably does not want this condition
        h->SetBinContent(i,j,content);
        h->SetBinError(i,j,sqrt(error));
      };

For TH3 we generalize all 2-dim structure to 3-dim (i.e. use Double_t kernel[][][] as the kernel), and use triple loops to cover the space. I.e.:

  // Copy all the data to the temporry buffers
  for (Int_t i=1; i<=h->GetNbinsX(); i++){
    for (Int_t j=1; j<=h->GetNbinsY(); j++){
      for (Int_t k=1; k<=h->GetNbinsZ(); k++){
        //...

This will get to be slow on large histograms, but most TH3 opperations have that problem, so caveat emptor.

I don’t really have a suggestion for what 3-dim to use (the kernels I used for the TH2 version are stolen from the collected wisdom of the raster graphics community). You might just start with a uniform 3x3x3 cube.

Anyway. Thanks and good luck.

David,

OK I will add an option (eg K5a, K5b, K3a) to select between 3 standard kernels

  • an option to process or not empty bins
  • taking care of the axis range (smooth only a subset)
  • fixing the inconherent loop index with under/overflows

Rene

Hi all,

I guess that there is some fundamental reasons concerning the choice of the kernel your want to use to smooth a specific histogram, dependaing on the shape of the distribution.

I mean, to my opinion, if the histogram is flat (ok, trivial problem…:slight_smile:) the best choice is a “flat” kernel ((111)(111)(111)). Is there any hint concerning the choice of the kernel, let’s say assuming a bi-linear distribution, a quadratic one., etc ? Or at least any reference to look for ?

Thanks