Understanding TGraphAsymmErrors

Dear all,

I am calculating some efficiencies. I use one TH1F that I fill from data and one TH1F on which I call FillRandom. Both TH1Fs have the same number of bins and the same minimum and maximum. I then apply the infamous

TGraphAsymmErrors *grA = new TGraphAsymmErrors() ;
grA->Divide(TH1F_data,TH1F_fillRandom,"cl=0.683 b(1,1) mode");
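(As I understand the option string: “cl=0.683” sets the confidence level, “b(1,1)” requests Bayesian intervals with a uniform Beta(1,1) prior, and “mode” uses the posterior mode as the central value.)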

There is something I do not understand. If I make the bin size too small (i.e. the binning too fine) or if I do not start the range from 0, I get this error:

Info in <TROOT::TEfficiency::CheckEntries>: Histograms are not consistent: passed bin content > total bin content
Error in <TROOT::TEfficiency::CheckConsistency>: passed TEfficiency objects do not have consistent bin contents
Error in <TGraphAsymmErrors::Divide>: passed histograms are not consistent

But when I use a coarser binning everything works fine.

Why does this happen?

Thanks for the help.

Example code:


void prova() {
  lhcbStyle();   // external LHCb plotting-style macro (defined elsewhere)
  TRandom3 r(0);

  Double_t decayt_min = 0.000, decayt_MAX  = 0.001 ;
  Double_t decay_size = 0.0001 ; // ns
  Int_t decayt_nbin =  (int)((decayt_MAX - decayt_min)/decay_size) ;
  cout<<"Number of bins: "<<decayt_nbin<<endl;

  TFile *in_fileA = new TFile("") ;   // file name omitted
  Int_t size_of_simuA = 517532 ;

  TTree* TtA = (TTree*) in_fileA->Get("B_Tuple/DecayTree") ;
  Long64_t nentriesA = TtA->GetEntries() ;
  cout<<" --- Number of events: "<<nentriesA<<" ---"<<endl ;

  TH1F *HNL_t_hA = new TH1F("HNL_t_hA","HNL_t_hA",decayt_nbin,decayt_min,decayt_MAX);
  TH1F *fa1_hA = new TH1F("fa1_hA","fa1_hA",decayt_nbin,decayt_min,decayt_MAX);

  Double_t HNL_t_dA ; TtA->SetBranchAddress("Lambda0_TRUETAU",&HNL_t_dA) ;
  Int_t HNL_BKGCAT_iA ; TtA->SetBranchAddress("Lambda0_BKGCAT",&HNL_BKGCAT_iA) ;

  TF1 *fa1A = new TF1("fa1A","[0]*TMath::Exp(1/(-[1])*x)",decayt_min,decayt_MAX);
  fa1A->SetParameter(0,1.) ;
  fa1A->SetParameter(1,0.01) ;
  
  for (Long64_t i=0; i<nentriesA; i++) {
    TtA->GetEntry(i);
    if (HNL_BKGCAT_iA > 20) continue ; // skip non-HNL (background) entries
    HNL_t_hA->Fill(HNL_t_dA) ;
  }
  
  fa1_hA->FillRandom("fa1A", size_of_simuA);  // sample the same number of entries from fa1A
  

  TGraphAsymmErrors *grA = new TGraphAsymmErrors() ;
  grA->Divide(HNL_t_hA,fa1_hA,"cl=0.683 b(1,1) mode");

  grA->Draw("ap");
}

Hi,

FillRandom fills fa1_hA following the distribution of fa1A. TGraphAsymmErrors::Divide(pass, all) in binomial mode requires that, for each bin, the entries in pass are less than or equal to those in all: by calling this function, you promise ROOT that pass is a subset of all. If you fill too many entries into pass (by reducing the bin width or increasing the number of entries), that is no longer the case. And even if you adjust these parameters, you can still end up with bins where pass <= all is violated, because FillRandom() creates a random distribution: statistical fluctuations alone can push a bin of pass above the corresponding bin of all.
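For illustration, one could guard the call with a bin-by-bin check. This is only a sketch; the helper name isSubset is made up, not a ROOT function:

// Hypothetical helper: true only if every bin of `pass`, including
// under- and overflow, has content <= the corresponding bin of `all`.
bool isSubset(const TH1 *pass, const TH1 *all) {
  for (Int_t b = 0; b <= pass->GetNbinsX() + 1; ++b)
    if (pass->GetBinContent(b) > all->GetBinContent(b)) return false;
  return true;
}

// e.g. if (isSubset(HNL_t_hA, fa1_hA)) grA->Divide(HNL_t_hA, fa1_hA, "cl=0.683 b(1,1) mode");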

In general, creating an efficiency (or pass fraction) from a sampled distribution is a fairly fragile procedure. Are you sure you’re really looking for a binomial uncertainty here?

Cheers, Axel

Hi,

Thanks for the answer. No, technically I do not need a binomial uncertainty here, because my values are far from 0 or 1.
What would you suggest as the best method?

Hi,

You could simply use TH1::Divide().
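For instance, a minimal sketch with the histogram names from your first macro (the Sumw2() calls before filling are my addition, so that Divide() can propagate the per-bin statistical errors):

HNL_t_hA->Sumw2();        // store sum of squared weights (call before filling)
fa1_hA->Sumw2();
HNL_t_hA->Divide(fa1_hA); // bin-by-bin ratio with uncorrelated error propagation
HNL_t_hA->Draw("ep");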

Cheers, Axel.

Hi,
This is what I was doing in the past, but with no luck in treating the errors correctly.
So now, following your suggestion, I do:

void prova() {
  Double_t decayt_min = 0.000, decayt_MAX  = 10 ;
  Double_t decay_size = 0.001 ; // ns
  Int_t decayt_nbin =  (int)((decayt_MAX - decayt_min)/decay_size) ;
  cout<<"Number of bins: "<<decayt_nbin<<endl;
  TFile *in_file = new TFile("") ;   // file name omitted
  Int_t size_of_simu = 517532 ;

  TTree* Tt = (TTree*) in_file->Get("B_Tuple/DecayTree") ;
  Long64_t nentries = Tt->GetEntries() ;
  cout<<" --- Number of events: "<<nentries<<" ---"<<endl ;

  TH1F *HNL_t_h = new TH1F("HNL_t_h","HNL_t_h",decayt_nbin,decayt_min,decayt_MAX);
  Double_t HNL_t_d ; Tt->SetBranchAddress("Lambda0_TRUETAU",&HNL_t_d) ;
  Int_t HNL_BKGCAT_i ; Tt->SetBranchAddress("Lambda0_BKGCAT",&HNL_BKGCAT_i) ;

  TF1 *fa1 = new TF1("fa1","[0]*TMath::Exp(1/(-[1])*x)",decayt_min,decayt_MAX);
  fa1->SetParameter(0,1.) ;
  fa1->SetParameter(1,0.01) ;

  for (Long64_t i=0; i<nentries; i++) {
    Tt->GetEntry(i);
    if (HNL_BKGCAT_i > 20) continue ; // skip non-HNL (background) entries
    HNL_t_h->Fill(HNL_t_d) ;
  }
  HNL_t_h->Divide(fa1);   // NB: Sumw2() was never called, so errors are not propagated
  HNL_t_h->SetMarkerColor(kBlue-7);
  HNL_t_h->Draw("ep");
}

This gives me a plot where the errors on the y axis are way too small to be real. Am I doing something wrong?

Hi,

You can use TGraphAsymmErrors::Divide(h1, h2) in the non-binomial case (Poisson ratio) by passing the option “pois”.
See root.cern.ch/doc/master/classTG … 4046be8c0b
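For instance, a minimal sketch reusing the names from your first macro:

TGraphAsymmErrors *grA = new TGraphAsymmErrors();
grA->Divide(HNL_t_hA, fa1_hA, "pois"); // errors for the ratio of two independent Poisson counts
grA->Draw("ap");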

If you are using weighted histograms, you might need to use the latest ROOT version to get the right uncertainties.

Lorenzo

Hi,
OK, in the end I think I will adopt the solution of dividing the histogram by the function:

TH1F *HNL_t_hA = new TH1F("HNL_t_hA","HNL_t_hA",decayt_nbin,decayt_min,decayt_MAX);
HNL_t_hA->Sumw2();   // enable error propagation (must be called before filling)

...

TF1 *fa1A = new TF1("fa1A","[0]*TMath::Exp(1/(-[1])*x)",decayt_min,decayt_MAX);
fa1A->SetParameter(0,1.);
fa1A->SetParameter(1,0.01);

...

HNL_t_hA->Divide(fa1A,size_of_simuA) ;
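For reference, my understanding of what that last call does per bin (TH1::Divide(TF1*, c1) divides each bin by c1 times the function evaluated at the bin centre); this is a rough illustration only, not code that needs to run:

// Approximate per-bin equivalent of HNL_t_hA->Divide(fa1A, size_of_simuA):
for (Int_t b = 1; b <= HNL_t_hA->GetNbinsX(); ++b) {
  Double_t den = size_of_simuA * fa1A->Eval(HNL_t_hA->GetBinCenter(b)); // c1 * f(x)
  HNL_t_hA->SetBinError(b, HNL_t_hA->GetBinError(b) / den);
  HNL_t_hA->SetBinContent(b, HNL_t_hA->GetBinContent(b) / den);
}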

I think this procedure should be the most efficient way of performing this efficiency calculation. Do you agree?
Thanks