Alphabetical sorting of alphanumeric labels of 2-D histogram

Dear ROOT Talk,

I seem to be having problems alphabetically sorting alphanumeric labels in 2-D histograms.
I can get it to work fine for 1-D histograms, but not for 2-D histograms.
I’m using ROOT version 5.34.10.

Working example for 1-D histogram case:

void test1D()
{
    char x[8];

    TTree *t = new TTree();
    t->Branch("x", x, "x[8]/C");

    strcpy(x, "b");
    t->Fill();

    strcpy(x, "b");
    t->Fill();

    strcpy(x, "a");
    t->Fill();

    t->Draw("x>>hist", "", "goff");
    TH1F *h = (TH1F*)gDirectory->Get("hist");
    // Want to do alphabetical sorting of bin labels here.
    h->LabelsOption("a", "X"); // This works.
    h->Draw();
    gPad->SaveAs("test1.eps");

    delete t;
    return;
}

Output: test1.eps (5.6 KB)

Non-working example for the 2-D case:

void test2D()
{
    char x[8], y[8];

    TTree *t = new TTree();
    t->Branch("x", x, "x[8]/C");
    t->Branch("y", y, "y[8]/C");

    strcpy(x, "a");
    strcpy(y, "a");
    t->Fill();

    strcpy(x, "c");
    strcpy(y, "c");
    t->Fill();

    strcpy(x, "b");
    strcpy(y, "b");
    t->Fill();

    t->Draw("y:x>>hist", "", "goff");
    TH2F *h = (TH2F*)gDirectory->Get("hist");
    // Want to do alphabetical sorting of bin labels here.
    h->LabelsOption("a", "X"); // Neither of these two work when uncommented.
    //h->LabelsOption("a", "Y");
    h->Draw("colz");
    gPad->SaveAs("test2.eps");

    delete t;
    return;
}

Output: test2.eps (7.36 KB)

Thank you,
Mich

Next, I tested a similar case for alphabetically sorting labels of a 2-D histogram without filling it from a TTree object.
It works fine.

void test2D_b()
{
    const int nx = 3;
    // String literals are const, so the label arrays hold const char*.
    const char *xlabel[nx] = {"a", "c", "b"};
    const char *ylabel[nx] = {"c", "a", "b"};

    TH2F *h = new TH2F("h", "h", nx, 0, nx, nx, 0, nx);
    for (int i = 0; i < 3; ++i) {
        h->GetXaxis()->SetBinLabel(i+1, xlabel[i]);
        h->GetYaxis()->SetBinLabel(i+1, ylabel[i]);
    }
    h->Fill(0.5, 0.5); // Fill entry for bin ("a", "c");
    h->Fill(1.5, 1.5); // Fill entry for bin ("c", "a");
    h->Fill(2.5, 2.5); // Fill entry for bin ("b", "b");
    // Want to do alphabetical sorting of bin labels here.
    h->LabelsOption("a", "X"); // Both of these work. 
    h->LabelsOption("a", "Y");
    h->Draw("colz");
    gPad->SaveAs("test2_b.eps");
}

Output: test2_b.eps (7.34 KB)

So I suppose there is something I am doing wrong with the fill from the tree.
I would still prefer to use the tree because it lets me make the plot without knowing how many bins I need a priori.
Does anyone know what I did wrong in the first post?

Thanks,
Mich

Can you please attach a sample ROOT file with a tree demonstrating your problem, if it’s not too big?
Or, if your data is private, just describe an example TTree (branches, names) that reproduces your problem?

Here is my sample ROOT file with the tree from my first post.
test.root (5.14 KB)
Here is the code to reproduce my problem with this ROOT file.

void test2D_c()
{
    TFile *f = new TFile("test.root", "read");
    TTree *t = dynamic_cast<TTree*>(f->Get("t"));
    t->Draw("y:x>>hist", "", "goff");
    TH2F *h = (TH2F*)gDirectory->Get("hist");
    // Want to do alphabetical sorting of bin labels here.
    //h->LabelsOption("a", "X"); // Neither of these two work when uncommented.
    h->LabelsOption("a", "Y");
    h->Draw("colz");
    gPad->SaveAs("test2_c.eps");
}

Basically my problem is that I can reorder the bin labels successfully, but I cannot get the data to correctly reflect the reordered labels.
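
For reference, what the sort is expected to do can be sketched outside ROOT: sorting the X labels should move each column of bin contents along with its label. The block below is a toy model in plain C++, not ROOT code. `LabelledGrid` and `sortXAlphabetically` are hypothetical names made up for illustration, and nothing here claims to be how TH1::LabelsOption is implemented.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Toy stand-in for a labelled 2-D histogram (NOT ROOT's TH2F).
// content[iy][ix] is the bin content for Y bin iy and X bin ix.
struct LabelledGrid {
    std::vector<std::string> xLabels;
    std::vector<std::string> yLabels;
    std::vector<std::vector<double>> content;
};

// Sort the X labels alphabetically and move every column with its label.
// The Y axis is left untouched.
void sortXAlphabetically(LabelledGrid &g)
{
    const std::size_t nx = g.xLabels.size();

    // order[i] = old X index that ends up at new X index i.
    std::vector<std::size_t> order(nx);
    std::iota(order.begin(), order.end(), std::size_t(0));
    std::sort(order.begin(), order.end(),
              [&g](std::size_t a, std::size_t b) { return g.xLabels[a] < g.xLabels[b]; });

    std::vector<std::string> newLabels(nx);
    for (std::size_t i = 0; i < nx; ++i) newLabels[i] = g.xLabels[order[i]];

    // Apply the same permutation to every row of contents.
    for (auto &row : g.content) {
        assert(row.size() == nx);
        std::vector<double> newRow(nx);
        for (std::size_t i = 0; i < nx; ++i) newRow[i] = row[order[i]];
        row = newRow;
    }
    g.xLabels = newLabels;
}
```

With X and Y labels both in the order a, c, b and one entry on the diagonal, sorting only X leaves the (c,c) entry in the Y row for "c" but moves it to the third X column. Sorting along a single axis therefore makes the off-diagonal pattern visible.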

Aha, I see now. Well, with your example it’s better to sort along only one axis to demonstrate the problem; otherwise (if you sort both) it looks “correct”: (a,a), (b,b), (c,c) have the same contents, as expected.
I’ll have a look at LabelsOption.

P.S. If you print bin contents after, say, h->LabelsOption("a", "X"), they are sorted correctly:

b | 0 3 0
c | 0 0 3
a | 3 0 0
    a b c

Ok, a bit of black magic from me and it works:

[code]void tree()
{
    TFile *f = new TFile("test.root", "read");
    TTree *t = dynamic_cast<TTree*>(f->Get("t"));
    t->Draw("y:x>>hist", "", "goff");
    TH2F *h = (TH2F*)gDirectory->Get("hist");
    // Want to do alphabetical sorting of bin labels here.

    h->BufferEmpty(1); // Fill the histogram from its buffer, then delete the buffer.
    h->LabelsOption("a", "X"); // Works once the buffer has been emptied.
    //h->LabelsOption("a", "Y");
    h->Draw("colz");
    gPad->SaveAs("test2_c.eps");
}[/code]
test2_c.eps (6.88 KB)

Thank you for looking into this.

Yes you are right, I should have only sorted along one axis in the example code for clarity.
Sorry for any confusions.

As you say, I looked at the actual bin contents of the histogram after sorting the labels, and they were correctly sorted along with the position of the labels, even without doing “h->BufferEmpty(1)”.
So I suppose this extra call is needed specifically for the plot to show up correctly.

Can you explain what you did?
Also there is something fishy about the histogram this produced.
The total number of entries clearly does not match with what is displayed in the statistics box on the upper right hand side of the plot.

If you print bin contents after the histogram was actually painted in a canvas (this happens after the macro was executed and the control returns back to gSystem/gApplication), you’ll see - it changes: instead of 3 you’ll have 1 (is it normalised somehow??) and they are not sorted anymore.

THistPainter checks if a histogram has a buffer and re-fills the histogram from this buffer; that’s why the sorted data is lost. BufferEmpty(1) deletes this buffer. I’m not sure, but perhaps LabelsOption should also sort the buffer?
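
To make the mechanism described above concrete, here is a toy model in plain C++. It is only a sketch of the behaviour as described in this thread (a repaint refills the bins from a buffer, so a reordering of the bins is undone unless the buffer was deleted first). `ToyBufferedHist` and all its members are hypothetical names; this is not ROOT's TH1 or THistPainter code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy model only: a 1-D "histogram" that remembers its raw fills in a buffer.
struct ToyBufferedHist {
    std::vector<double> bins;        // current bin contents
    std::vector<std::size_t> buffer; // bin index of every buffered fill
    bool hasBuffer = true;

    explicit ToyBufferedHist(std::size_t n) : bins(n, 0.0) {}

    void fill(std::size_t bin)
    {
        assert(bin < bins.size());
        bins[bin] += 1.0;
        if (hasBuffer) buffer.push_back(bin);
    }

    // Stand-in for a label sort: reorders bin contents, leaves the buffer alone.
    void swapBins(std::size_t a, std::size_t b) { std::swap(bins[a], bins[b]); }

    // Stand-in for BufferEmpty(1) as described above: keep contents, delete buffer.
    void dropBuffer()
    {
        buffer.clear();
        hasBuffer = false;
    }

    // Stand-in for painting: if a buffer exists, reset and refill from it.
    void paint()
    {
        if (!hasBuffer) return;
        std::fill(bins.begin(), bins.end(), 0.0);
        for (std::size_t bin : buffer) bins[bin] += 1.0;
    }
};
```

In this model, calling swapBins and then paint restores the original order because the buffer still records the unsorted fills. Calling dropBuffer before swapBins makes paint a no-op, so the reordering survives, which mirrors the effect of the BufferEmpty(1) workaround.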

[quote]
The total number of entries clearly does not match with what is displayed in the statistics box on the upper right hand side of the plot.[/quote]

What do you mean? Do you have more than 3 entries in your hist?

I mean that in the output plot you provide “test2_c.eps”, the stat box says that there are a total of 3 entries in the histogram. Looking at the scale of the colored axis, there is something more like a total of 9 entries.
In all the other plots up to the one you posted (test2_c.eps), I couldn’t get the data sorted according to how the labels were being sorted, but the total number of entries seemed to match with what was in the stat box.
Now with your fix, the data is being sorted with the labels, but the total normalization is not showing correctly.

I’m getting a bit confused here.
I thought I had filled my initial tree (called “t” inside the test.root file above) with 3 entries.
Doing the following indeed shows me that there are only 3 entries in the tree (as there should be): one for each pairing of (a, a), (b, b), (c, c).

********* Begin snippet from BASH ********
mich@server$ root test.root
root [0]
Attaching file test.root as _file0…
root [1] _file0->ls()
TFile** test.root
TFile* test.root
KEY: TTree t;1 t
root [2] t->GetEntries()
(const Long64_t)3
root [3] t->Scan()


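************************************
*    Row   *         x *         y *
************************************
*        0 *         a *         a *
*        1 *         c *         c *
*        2 *         b *         b *
************************************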
    

(Long64_t)3
root [4]
************* End snippet from BASH *********

I wonder why the histogram you plotted shows that there are 3 entries per filled bin instead of 1?

[quote=“mich”]I mean that in the output plot you provide “test2_c.eps”, the stat box says that there are a total of 3 entries in the histogram. Looking at the scale of the colored axis, there is something more like a total of 9 entries.
[/quote]
Your original histogram/plot has 3 entries (according to the stat box), and my histogram with sorted labels has 3 entries. Is it wrong? We have to understand, though, why the bin content is 3, not 1, but still: 3 bins contain 3 each.

Me too. BufferEmpty accepts an integer parameter, Int_t action; a comment in the code says:

[quote]// action = -1 histogram is reset and refilled from the buffer (called by THistPainter::Paint)
// action = 0 histogram is reset and filled from the buffer. When the histogram is filled from the
// buffer the value fBuffer[0] is set to a negative number (= - number of entries)
// When calling with action == 0 the histogram is NOT refilled when fBuffer[0] is < 0
// While when calling with action = -1 the histogram is reset and ALWAYS refilled independently if
// the histogram was filled before. This is needed when drawing the histogram
//
// action = 1 histogram is filled and buffer is deleted
// The buffer is automatically deleted when filling the histogram and the entries is
// larger than the buffer size[/quote]

So why GetBinContent(1,1) returns different values before/after you draw the same histogram WITHOUT MODIFYING THIS HISTOGRAM is beyond my comprehension. I think it’s either a bug or a problem with the interface/implementation/documentation.

No, I think it’s one entry per bin and bin content is 3, we have to understand why a bin content is 3 instead of 1.

What you can do is this:

TFile *f = new TFile("test.root", "read");
TTree *t = dynamic_cast<TTree*>(f->Get("t"));
t->Draw("y:x>>hist", "", "goff");
TH2F *h = (TH2F*)gDirectory->Get("hist");
// Want to do alphabetical sorting of bin labels here.
h->BufferEmpty(-1);
h->BufferEmpty(1);
h->LabelsOption("a", "Y");
h->Draw("colz");

Now at least you do not have this confusing 3, but how reliable these results are and how to explain this behaviour, I do not know. For a test, I suggest you use something more interesting, like 2 (a,a), 3 (b,b), 1 (c,c), and then we should have a look at the plot this macro generates.
For me this is nonsense: GetBinContent(1,1) for the same hist SHOULD ALWAYS give me the same value before and after the call to hist->Draw(), and if not, this MUST BE documented and explained in detail (I’ve personally never seen any explanation, but I do not work with histograms/trees, so maybe somebody else can explain this magic).

OK, I created a new tree in this root file: test2.root (5.17 KB).
I filled it with what you suggest and confirmed its content by doing the following:
******* Begin snippet from BASH *********
mich@server$ root test2.root
root [0]
Attaching file test2.root as _file0…
root [1] _file0->ls()
TFile** test2.root
TFile* test2.root
KEY: TTree t;1 t
root [2] t->GetEntries()
(const Long64_t)6
root [3] t->Scan()


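************************************
*    Row   *         x *         y *
************************************
*        0 *         a *         a *
*        1 *         a *         a *
*        2 *         c *         c *
*        3 *         b *         b *
*        4 *         b *         b *
*        5 *         b *         b *
************************************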
    

(Long64_t)6
root [4]
****** End snippet from BASH *******

And the output/plot look good!
(I moved the stat box for easier viewing.)
test2D_d.eps (6.89 KB)

I also checked the content of the histogram by printing it out each step of the way during the sorting process in your code above.
It gave me:

mich@server$ root test2D_d.C 
root [0] 
Processing test2D_d.C...
Print hist content after gDirectory->Get("hist"):
yBin
3 | 0 0 9 
2 | 0 3 0 
1 | 6 0 0 
  -------
    1 2 3 xBin

Print hist content after BufferEmpty(-1):
yBin
3 | 0 0 3 
2 | 0 1 0 
1 | 2 0 0 
  -------
    1 2 3 xBin

Print hist content after BufferEmpty(1):
yBin
3 | 0 0 3 
2 | 0 1 0 
1 | 2 0 0 
  -------
    1 2 3 xBin

Print hist content after LabelsOption("a", "Y"):
yBin
3 | 0 1 0 
2 | 0 0 3 
1 | 2 0 0 
  -------
    1 2 3 xBin

Info in <TCanvas::MakeDefCanvas>:  created default TCanvas with name c1
Print hist content after Draw("colz"):
yBin
3 | 0 1 0 
2 | 0 0 3 
1 | 2 0 0 
  -------
    1 2 3 xBin

root [1]

Again there is this strange scaling of the data by a factor of 3, but only right after the hist is filled from the tree.
But as far as I’m concerned, my problem is solved, so thanks so much!
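
For anyone who wants to produce similar step-by-step dumps, a small formatting helper is enough. The sketch below works on a plain vector of rows rather than on a TH2F, so it runs without ROOT. `formatGrid` is a hypothetical name, and the layout only approximates the printout shown above. In a ROOT macro the values would presumably come from h->GetBinContent(ix, iy) with bin numbers starting at 1.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// content[iy][ix] is the bin content for Y bin iy+1 and X bin ix+1.
// Rows are printed from the highest Y bin down, like a plot read top to bottom.
std::string formatGrid(const std::vector<std::vector<double>> &content)
{
    const std::size_t nx = content.empty() ? 0 : content[0].size();
    std::ostringstream out;
    out << "yBin\n";
    for (std::size_t iy = content.size(); iy-- > 0;) {
        assert(content[iy].size() == nx); // every row must have the same width
        out << (iy + 1) << " |";
        for (double v : content[iy]) out << ' ' << v;
        out << '\n';
    }
    // X bin numbers, aligned under the columns.
    out << "   ";
    for (std::size_t ix = 0; ix < nx; ++ix) out << ' ' << (ix + 1);
    out << " xBin\n";
    return out.str();
}
```

Calling it once after each step (Get, BufferEmpty, LabelsOption, Draw) and comparing the strings gives a quick check of whether a step changed the contents.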