Save data points (from a histogram) in a txt file

Dear all,
I have made a TH2F histogram, let’s call it h5. The data used to fill this histogram come from a very large .root file.

So I would like to ask you this: how can I write the x,y coordinates and the counts of this specific histogram to a .txt or .dat file?

I know that there is the command h5->Print("all"), which prints them on the terminal. But this way, I can’t use them for further analysis.

Thank you very much in advance.

ROOT Version: Not Provided
Platform: Not Provided
Compiler: Not Provided


Hello,

Is the histogram now stored in a ROOT file?

If yes, you can write a ROOT macro (e.g. histo.C) that loads the histogram from the file and calls Print, and then redirect the output to a file when you run it: root -q histo.C > output.txt.

also:

root [0] .> hpx.dat
hpx->Print("all")
.>
root [3] .q

Thank you very much for the fast response to my question!

First of all, I did save the histogram in a .root file. And your method truly worked. It really redirected the output from the terminal to a .txt file!

Beyond that, it would be much appreciated if you could help me shed some more light on this. To be more specific, could I present this output in a more “classy” way in the .txt file? Instead of using h5->Print("all");, is there any other command I could use, like cout or something similar, to present the values in a more organized way?

Thank you in advance for your insight into this matter
stekoul

Thank you very much for the feedback!

and also:


A really useful thread! Thank you so much!

Cheers,
stekoul

Hey again!

After carefully studying the aforementioned thread, I would like to ask a question based on the remark I made yesterday about presenting the data in a more “classy” way. First of all, I have succeeded in redirecting the output from the terminal to a file, as discussed yesterday, for a TH2 histogram. Afterwards, for a TH1 histogram, I have successfully written the bin center and the bin content of each bin to a file (as you can see in the uploaded macro).
histogram_txt.C (927 Bytes)

So my question is this: in the case of a TH2 histogram, how can I write a similar macro? I suppose that I have to use a double “for” loop in order to extract the data points (x,y) and the counts for each bin. But I’m really at a loss as to how to actually write this macro.

This is really my last question in this thread. I’m sorry for the inconvenience.
stekoul

Yes, something like that.

You can try with:

for(int i=1; i<=h2->GetNbinsX(); ++i) {
  for(int j=1; j<=h2->GetNbinsY(); ++j) {
    cout << h2->GetBinContent(i,j) << " ";
  }
  cout << endl;
}

You can do something similar with h2->GetXaxis()->GetBinCenter(i) and h2->GetYaxis()->GetBinCenter(j) or so.

This does indeed yield the counts for each bin. But how about printing the coordinates of each bin as well?

It’s written below.


I believe that I didn’t phrase my question correctly. I meant the x,y coordinates of each point and not the values of binx and biny. In the following simple macro I have made a 2D histogram filled with random numbers.
hIHisto.C (432 Bytes)

For instance, the command h2->Print("all"); prints binc and the coordinates x,y.

So I want to do the same, but writing these values to an ASCII file (.dat, .txt, etc.) in a proper format, using “for” loops instead of the aforementioned command.

The functions I suggested to you in the post above give the coordinates x,y of each point, not the values of binx and biny. Please try it out.

Print("all") gives:

 fSumw[0][0]=0, x=-5.1, y=-5.1
 fSumw[1][0]=0, x=-4.9, y=-5.1
 fSumw[2][0]=0, x=-4.7, y=-5.1
 fSumw[3][0]=0, x=-4.5, y=-5.1
 fSumw[4][0]=0, x=-4.3, y=-5.1
 fSumw[5][0]=0, x=-4.1, y=-5.1
 fSumw[6][0]=0, x=-3.9, y=-5.1
 fSumw[7][0]=0, x=-3.7, y=-5.1
 fSumw[8][0]=0, x=-3.5, y=-5.1
 fSumw[9][0]=0, x=-3.3, y=-5.1

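Whereas GetBinCenter or GetBinLowEdge return things like the following (the loop starts at bin 1, so the underflow bins with index 0 that Print("all") lists first are skipped):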

-4.9 -4.9 0 
-4.9 -4.7 0 
-4.9 -4.5 0 
-4.9 -4.3 0 
-4.9 -4.1 0 
-4.9 -3.9 0 
-4.9 -3.7 0 
-4.9 -3.5 0 
-4.9 -3.3 0 
-4.9 -3.1 0 
-4.9 -2.9 0 
-4.9 -2.7 0 
-4.9 -2.5 0 
-4.9 -2.3 0 
-4.9 -2.1 0 
-4.9 -1.9 0 
-4.9 -1.7 0

binx, and biny would be just i and j.
GetBinCenter gets the x/y coordinates of those indices.

  for(int i=1; i<=h2->GetNbinsX(); ++i) {
    for(int j=1; j<=h2->GetNbinsY(); ++j) {
      cout << h2->GetXaxis()->GetBinCenter(i) << " " << h2->GetYaxis()->GetBinCenter(j) << " " << h2->GetBinContent(i,j) << endl;
    }
    cout << endl;
  }
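
To get the same three columns into a file rather than onto the terminal, the loop can be pointed at a file stream instead of cout. What follows is only a minimal sketch, not an official ROOT recipe: the helper names FormatHist2D and SaveHist2D and the file name h2_points.dat are made up for illustration, and the functions are templates only so that the snippet compiles without ROOT headers. They rely on nothing but GetNbinsX, GetNbinsY, GetXaxis()->GetBinCenter, GetYaxis()->GetBinCenter and GetBinContent, so passing *h2 from inside a macro should work.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of ROOT): one "x y content" line per bin.
// Hist can be any type offering the TH2-style accessors used below.
template <class Hist>
std::string FormatHist2D(const Hist& h) {
  std::ostringstream out;
  // Bins 1..N are the regular bins; 0 and N+1 (underflow/overflow) are skipped.
  for (int i = 1; i <= h.GetNbinsX(); ++i) {
    for (int j = 1; j <= h.GetNbinsY(); ++j) {
      out << h.GetXaxis()->GetBinCenter(i) << ' '
          << h.GetYaxis()->GetBinCenter(j) << ' '
          << h.GetBinContent(i, j) << '\n';
    }
  }
  return out.str();
}

// Writes the table to a text file; returns false if the file cannot be written.
template <class Hist>
bool SaveHist2D(const Hist& h, const std::string& path) {
  std::ofstream file(path);
  if (!file) return false;
  file << FormatHist2D(h);
  return static_cast<bool>(file);
}
```

In a macro this would be called as SaveHist2D(*h2, "h2_points.dat");. Note that the x and y columns are bin centers, one line per bin.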

But GetBinCenter returns the center of the bin and not the actual coordinates.
E.g. let’s say that I have filled a histogram with these values:

TH2 *h5 = new TH2D("h5", "2D Histo", 10,0,10, 10,0,10);

h5->Fill(4.9,2.35);
h5->Fill(1.98,3.64);
h5->Fill(1.37,6.66);
h5->Fill(5.2,5.55);
h5->Fill(8.96,7.74);
h5->Fill(0.8,0.3);

Now, if I run this loop:

for(int i=1; i<=h5->GetNbinsX(); ++i) {
  for(int j=1; j<=h5->GetNbinsY(); ++j) {
    cout << h5->GetXaxis()->GetBinCenter(i) << " " << h5->GetYaxis()->GetBinCenter(j) << " " << h5->GetBinContent(i,j) << endl;
  }
  cout << endl;
}

I get this (an excerpt of the result):
5.5 0.5 0
5.5 1.5 0
5.5 2.5 0
5.5 3.5 0
5.5 4.5 0
5.5 5.5 1
5.5 6.5 0
5.5 7.5 0
5.5 8.5 0
5.5 9.5 0

And it doesn’t return, for example, the (x,y) = (5.2,5.55) that I wanted.
Sorry that I keep asking questions, but I’m a newbie… I’m very grateful for all this useful feedback! It is really helpful.

Best regards,
stekoul

OK, I understand your question now.

This probably comes from the fact that you are drawing the 2D histogram as a scatter plot.
See ROOT: THistPainter Class Reference

This gives you the impression that the points inside a bin are individual points distinct from the bin center, i.e. the original points you filled in. But this is not really the case. The bin content is just a value, and ROOT then paints individual points scattered randomly around it, see:

For each cell (i,j) a number of points proportional to the cell content is drawn. A maximum of kNMAX points per cell is drawn. If the maximum is above kNMAX contents are normalized to kNMAX (kNMAX=2000)

So some questions for you:

  • Do you want to get the coordinates of each of these drawn points, even though they are totally unrelated to the coordinates of the original points you filled? See ROOT: hist/histpainter/src/THistPainter.cxx Source File
  • Or do you rather want to draw your data using a TGraph class, where you can have many many points, without any binning, and you can later retrieve the exact coordinate of each of them?
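
To illustrate why the original values cannot be read back from the histogram, here is a small stand-alone sketch of uniform binning. It is not ROOT's own code: BinIndex and BinCenter are hypothetical helpers that only mimic what a fixed-width axis does, with bin 0 as underflow and bin nbins+1 as overflow.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: maps a value to a bin number on a uniform axis.
// Returns 0 for underflow and nbins + 1 for overflow.
int BinIndex(double x, int nbins, double lo, double hi) {
  assert(nbins > 0 && hi > lo);
  if (x < lo) return 0;
  if (x >= hi) return nbins + 1;
  const double width = (hi - lo) / nbins;
  int bin = 1 + static_cast<int>(std::floor((x - lo) / width));
  if (bin > nbins) bin = nbins;  // guard against rounding at the upper edge
  return bin;
}

// Hypothetical helper: center of a regular bin on the same uniform axis.
double BinCenter(int bin, int nbins, double lo, double hi) {
  const double width = (hi - lo) / nbins;
  return lo + (bin - 0.5) * width;
}
```

For a point filled at (5.2, 5.55) on axes with 10 bins from 0 to 10, BinIndex gives bin 6 on both axes and BinCenter(6, 10, 0, 10) is 5.5. Only the count of bin (6,6) is increased; the values 5.2 and 5.55 themselves are not kept anywhere in the histogram.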

Ohh I really didn’t know that!

I will give you an exact overview of what I want to do: I have a .root file which contains a large amount of experimental data. Using this .root file, let’s say that I have made a TH2 histogram (“colz”) correlating two parameters of the file under certain conditions.

As a next step, I want to isolate the events that are now depicted in this histogram (meaning that I want their coordinates or, to be more precise, the exact values of their parameters) in order to analyze them further and possibly make 1D distributions of them.

This is the case that I am dealing with! All of the above excerpts were just parts of some macros that I wrote as exercises on the way to the big picture, which is what I have described in this reply.

Once you learn about the correlation between the two parameters, write it down or store it in a variable in your script. For example, you may want to analyze specifically the region where x > 2 and x <= 3.5.

With that, you need to traverse your TTree again and apply your filter. You cannot do it directly on the TH2F because your data points are lost (binned). So start again from the first entry in your TTree and go through to the last one. On every iteration, check whether x is outside the desired range. If it is, continue with the next iteration instead of analyzing that entry.
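
The filtering step can be sketched as follows. This is a stand-alone illustration under stated assumptions: the Event struct and the SelectInRange function are hypothetical, and the events sit in a std::vector rather than in a TTree. With a real tree, the same range test would presumably go inside the loop over the entries, after each entry has been read.

```cpp
#include <vector>

// Hypothetical event record; in a real analysis these would be the two
// tree branches that were used to fill the 2D histogram.
struct Event {
  double x;
  double y;
};

// Keeps only the events with lo < x <= hi (e.g. 2 < x <= 3.5).
std::vector<Event> SelectInRange(const std::vector<Event>& events,
                                 double lo, double hi) {
  std::vector<Event> selected;
  for (const Event& ev : events) {
    if (ev.x <= lo || ev.x > hi) continue;  // outside the range: skip
    selected.push_back(ev);                 // inside: keep the exact values
  }
  return selected;
}
```

The selected events keep their exact x and y values, so they can be written to a file or used to fill 1D histograms afterwards.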