Difference In Extracting Histogram With Variable Compared To String Literal - Bug?

JackLindon · January 12, 2021, 11:51pm

Hi,

I have a file with a bunch of histograms named “Varied0”, “Varied1”, “Varied2”, etc.

I am trying to access them one by one in a loop, like this:

for (int i=0;i<100;i++){

    std::string name{"Varied"};
    name+= std::to_string(i);

    TFile *dataFile = new TFile("file.root");                                                                                                                                                    
    TH1F* dataHist = new TH1F(name.c_str(),name.c_str(), 100, 0, 4);                                                                                                                                        
    dataHist = (TH1F*) dataFile->Get(name.c_str());                                                                                                                                                         
    dataHist->Draw();                                                                                                                                                                                       
}

However, with this code every histogram drawn is just the blank histogram created before Get(name.c_str()) is called, with the correct name as its title (e.g. Varied0).

I instead tried changing std::to_string(i) to std::to_string(0) just to test, and they are still blank.

However, if I replace dataHist = (TH1F*) dataFile->Get(name.c_str());
with dataHist = (TH1F*) dataFile->Get("Varied0");

which should be identical, this time I do get the histogram Varied0.

pcanal · January 13, 2021, 12:16am

Why not use:

for (int i=0;i<100;i++){

    std::string name{"Varied"};
    name+= std::to_string(i);

    TFile *dataFile = new TFile("file.root");                                                                                                                                                    
    TH1F* dataHist = (TH1F*) dataFile->Get(name.c_str());                                                                                                                                                         
    dataHist->Draw();                                                                                                                                                                                       
}

What is the purpose of creating a TH1F object only to then (sort of) lose access to it? In the code in the original post, the file ends up owning two histograms of the same name: the empty one just created and the one whose data is on disk.

JackLindon · January 13, 2021, 12:21am

Hi,

Thanks for the response. Your solution works, but my question is rather: why does ROOT treat name.c_str() (where name.c_str()=="Varied0") and the literal "Varied0" differently in this case?
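As a quick sanity check on the premise of this question, here is a minimal sketch in plain C++ (no ROOT involved; `makeName` and `sameAsLiteral` are hypothetical helpers, not part of the original macro). It shows that a name built with std::to_string holds exactly the same characters as the literal, so a function that receives a const char* and compares by content has no way to tell the two apart:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper: builds "Varied<i>" the same way the loop does.
std::string makeName(int i) {
    assert(i >= 0);  // histogram indices in this thread are non-negative
    std::string name{"Varied"};
    name += std::to_string(i);
    return name;
}

// True when the built name and the given literal hold the same characters.
bool sameAsLiteral(int i, const char* literal) {
    return std::strcmp(makeName(i).c_str(), literal) == 0;
}
```

If the two calls behave differently in practice, the cause is therefore more likely to be something else that changed between the two runs than the way the string was produced.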

pcanal · January 13, 2021, 12:46am

I am not sure. A priori it should not; I would need a complete reproducer to investigate.

JackLindon · January 13, 2021, 1:08am

Hi, I have produced a minimal working example.

Attached (file.root (35.4 KB)) is a ROOT file with 100 histograms named “Varied0”, “Varied1”, etc., up to “Varied99”. Each histogram is just a simple histogram with 100 bins between 0 and 100, with a single entry whose value is the same as the number of the histogram (e.g. Varied14 has a single entry in the 14th bin).

Also attached (stringLiteralTest.cpp (619 Bytes)) is a cpp file that simply does what has been discussed: it attempts to loop over each of these histograms and, to test whether it has succeeded, saves a .png of each histogram with the filename “Varied0.png”, “Varied1.png”, etc.

Yet the output of this (when running with root -l stringLiteralTest.cpp in ROOT v6.22.06) is just 100 PNGs of completely empty histograms. If line 22

      dataHist = (TH1F*) dataFile->Get(name.c_str());

is changed to

      dataHist = (TH1F*) dataFile->Get("Varied99");

(or any other "VariedX" where X is 0 to 99), the output is instead 100 .png copies of the Varied99 histogram, as expected. However, this does not seem to make much sense, since if line 9

      name+= std::to_string(i);

is changed to

      name+= std::to_string(99);

(or any other value between 0 and 99), the output is still 100 completely blank histograms, even though line 22 should now be getting the exact same input, since name.c_str()=="Varied99".

pcanal · January 13, 2021, 7:01pm

Actually, for me it is 99 copies of the good Varied99 and 1 empty Varied99 (the one actually named Varied99.png).

Is that different for you?

The result I see is the expected result. The newly created histogram is attached to the TFile (by default; this is changeable), and TFile::Get returns the histogram with the requested name, giving priority to the one already in memory.

Another source of confusion I missed earlier is that the code reads:

  for (int i=0;i<100;i++){
      ....
      TFile *dataFile = new TFile("file.root");

which has the net result of opening the same physical file 100 times (and since there is no explicit deletion, the TFile objects will be deleted by the ROOT infrastructure only at the end of the process).
This means in particular that the 100 iterations are actually completely independent.
For example, when you have:

  for (int i=0;i<100;i++){
      std::string name{"Varied"};
      name+= std::to_string(i);
      TFile *dataFile = new TFile("file.root");
      TH1F* dataHist0 = new TH1F(name.c_str(),name.c_str(), 100, 0, 4);
      TH1F* dataHist = (TH1F*) dataFile->Get("Varied0");

At the first iteration, the TFile object has the empty histogram Varied0 when Get is called (and thus Varied0.png should show an empty histogram).
At the second iteration, the TFile object (new for this iteration) has ONLY the empty histogram named Varied1 when Get is called (and thus Varied1.png should get the content of the real/intended Varied0).
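The lookup order described above can be illustrated with a toy model. This is only a sketch in plain C++ under the assumption stated in this post (objects in memory take priority over objects on disk); `ToyHist`, `ToyFile`, and `entriesSeen` are hypothetical names and do not reproduce ROOT's actual implementation:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Toy stand-in for a histogram: only a name and an entry count.
struct ToyHist {
    std::string name;
    int entries;
};

// Toy stand-in for an open file: objects stored "on disk" plus objects
// created in memory while this file was the current directory.
class ToyFile {
public:
    explicit ToyFile(std::map<std::string, ToyHist> onDisk)
        : disk_(std::move(onDisk)) {}

    // Mimics creating a new, empty histogram attached to this file.
    void createEmpty(const std::string& name) {
        memory_[name] = ToyHist{name, 0};
    }

    // Assumed lookup order: memory first, then disk.
    // Returns nullptr when the name exists in neither place.
    const ToyHist* get(const std::string& name) const {
        auto m = memory_.find(name);
        if (m != memory_.end()) return &m->second;
        auto d = disk_.find(name);
        if (d != disk_.end()) return &d->second;
        return nullptr;
    }

private:
    std::map<std::string, ToyHist> disk_;
    std::map<std::string, ToyHist> memory_;
};

// One loop iteration: open a fresh file, create an empty histogram called
// `created`, then ask for `requested`. Returns the entry count of whatever
// the lookup hands back.
int entriesSeen(const std::string& created, const std::string& requested) {
    std::map<std::string, ToyHist> disk;
    disk["Varied0"] = ToyHist{"Varied0", 1};
    disk["Varied1"] = ToyHist{"Varied1", 1};
    ToyFile file(disk);  // a new file object per iteration, as in the loop
    file.createEmpty(created);
    const ToyHist* h = file.get(requested);
    assert(h != nullptr);
    return h->entries;
}
```

In this model, entriesSeen("Varied0", "Varied0") yields the empty in-memory object, while entriesSeen("Varied1", "Varied0") falls through to the object on disk, which mirrors the first and second iterations described above.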

So, in the end, what you should be using is:

TFile *dataFile = new TFile("file.root");                                                                                                                                                    
for (int i=0;i<100;i++){
    std::string name{"Varied"};
    name+= std::to_string(i);

    TH1F* dataHist = (TH1F*) dataFile->Get(name.c_str());                                                                                                                                                         
    dataHist->Draw();                                                                                                                                                                                       
}

so that you do not waste resources reopening the same file multiple times, and do not waste resources creating an unused empty histogram on each pass through the loop.

Cheers,
Philippe.

JackLindon · January 13, 2021, 9:37pm

Hi,

Actually, for me it is 99 copies of the good Varied99 and 1 empty Varied99 (the one actually named Varied99.png).

Yes, this is the same for me; I had not noticed the empty one.

The result I see is the expected result. The newly created histogram is attached to the TFile (by default; this is changeable), and TFile::Get returns the histogram with the requested name, giving priority to the one already in memory.

Yes, this is the expected result for the code with the line

      dataHist = (TH1F*) dataFile->Get("Varied99");

but it is not the expected result for the code with the line

      dataHist = (TH1F*) dataFile->Get(name.c_str());

even though they should be identical. That is, the two programs below:

#include "TCanvas.h"
#include "TFile.h"
#include "TH1.h"
#include "TPad.h"
#include "TStyle.h"
#include <string>


int stringLiteralTest(){
  for (int i=0;i<100;i++){    
      std::string name{"Varied"};
      name+= std::to_string(99);

      TFile *dataFile = new TFile("file.root");
      TH1F* dataHist = new TH1F(name.c_str(),name.c_str(), 100, 0, 4);
      dataHist = (TH1F*) dataFile->Get(name.c_str());

      std::string fileName=name;
      fileName+=".png";
      TCanvas *c1 = new TCanvas("c1","transparent pad",200,10,600,600);
      dataHist->Draw("");
      gPad->SetLogy();
      gStyle->SetOptFit(1);
      c1->SaveAs(fileName.c_str());
  }
  return 0;
}

And

#include "TCanvas.h"
#include "TFile.h"
#include "TH1.h"
#include "TPad.h"
#include "TStyle.h"
#include <string>


int stringLiteralTest(){
  for (int i=0;i<100;i++){    
      std::string name{"Varied"};
      name+= std::to_string(99);
      
      TFile *dataFile = new TFile("file.root");
      TH1F* dataHist = new TH1F(name.c_str(),name.c_str(), 100, 0, 4);
      dataHist = (TH1F*) dataFile->Get("Varied99");

      std::string fileName=name;
      fileName+=".png";
      TCanvas *c1 = new TCanvas("c1","transparent pad",200,10,600,600);
      dataHist->Draw("");
      gPad->SetLogy();
      gStyle->SetOptFit(1);
      c1->SaveAs(fileName.c_str());
  }
  return 0;
}

should both give the exact same result. The only difference is that in one case dataFile->Get(name.c_str()); is passed name.c_str(), where name.c_str()=="Varied99", while in the other case it is passed a string literal, as dataFile->Get("Varied99");

Both of these programs should give the exact same result, no? But they do not: the second case prints 99 copies of Varied99 and 1 empty Varied99, whereas the first case prints 100 empty Varied99.

I realise this is not the correct way to loop over reading histograms, but regardless of this, the behaviour of these two programs should be identical, no?

Wile_E_Coyote · January 13, 2021, 10:07pm

First, as Philippe writes, you are opening the “dataFile” 100 times in the “for” loop so, move the relevant line to a place before the “for” loop.

Then, as Philippe writes, the new TH1F(name.c_str(), ...) call creates a completely new histogram that resides in “dataFile” and immediately hides (masks) any existing object with the same name that may already be present in “dataFile”. The following dataFile->Get(name.c_str()) call therefore returns this newly created histogram, regardless of whether a histogram of that name was already present in “dataFile”.

So, use the last code snippet given by Philippe in his last post above, right after “what you should be using is”.

pcanal · January 13, 2021, 10:07pm

I cannot reproduce this per se. Since all the file names are the same (all Varied99.png), all I can see is the file produced by the very last iteration, and it is (as expected) an empty histogram.

And indeed, I see no mechanism for these last two code snippets to behave differently.

JackLindon · January 13, 2021, 10:47pm

Hi,

I figured out what the issue was. In the case where name is set to “Varied99” in every iteration, the line
TH1F* dataHist = new TH1F(name.c_str(),name.c_str(), 100, 0, 4);
gives every empty histogram that same name, so Get always finds the empty one. In the case where only the Get call was changed to dataFile->Get("Varied99"); and std::to_string(i) was kept as it was, the empty histograms have unique names, so only the one named Varied99 hides the histogram on disk.
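This resolution can be checked with a small counting sketch in plain C++ (no ROOT; `countEmpty` is a hypothetical helper). It assumes the rule discussed in this thread: within one freshly opened file, the lookup returns the empty in-memory histogram exactly when its name equals the requested name.

```cpp
#include <cassert>
#include <string>

// Counts how many of `iterations` lookups for "Varied99" would return the
// freshly created empty histogram instead of the one stored on disk.
int countEmpty(bool uniqueNames, int iterations = 100) {
    assert(iterations > 0);
    const std::string requested = "Varied99";
    int empty = 0;
    for (int i = 0; i < iterations; ++i) {
        // uniqueNames == true  corresponds to name += std::to_string(i)
        // uniqueNames == false corresponds to name += std::to_string(99)
        const std::string created =
            "Varied" + std::to_string(uniqueNames ? i : 99);
        if (created == requested) ++empty;  // empty histogram masks the disk one
    }
    return empty;
}
```

With unique names only the iteration i = 99 is masked, which matches the 99 good copies plus 1 empty one reported earlier; with the fixed name all 100 lookups are masked.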

Cheers,
Jack
