Problem with TTree::Project() to a 2D histogram

Dear experts,

I would like to extract a 2D histogram from a tree using TTree::Project() and ran into a problem.
The projected histogram seems to have far fewer entries than what I get using TTree::Draw().

## go to the sample directory, this link should allow you to access it
## https://cernbox.cern.ch/index.php/s/9Z2gt2QKyIFTz5v
cd /eos/user/s/sandrean/ntuples/stop1L/background/mc16d_ttZ

python
from ROOT import *
from array import array

tree = TChain("mc16d_ttZ_Nom")
tree.Add("*.root")
h_temp = TH2D("temp", "", 30, 0, 600e3, 3, array('d',[250e3, 350e3, 450e3, 550e3]))
tree.Project("temp", "met:mt", "(stxe_trigger==1 && lep_pt[0]>25e3 && n_jet>3 && jet_pt[0]>25e3 && jet_pt[1]>25e3 && jet_pt[2]>25e3 && jet_pt[3]>25e3 && met>230e3 && dphi_jet0_ptmiss>0.4 && dphi_jet1_ptmiss>0.4 && mt>30e3 && (mT2tauLooseTau_GeV>80||mT2tauLooseTau_GeV<0)&&1) * (1) * weight * xs_weight * sf_total * weight_sherpa22_njets * 150000")
h_temp.Draw()

[Screenshot from 2018-10-26 16-47-33]

Compare this if I just do:

tree.Draw("met:mt", "(stxe_trigger==1 && lep_pt[0]>25e3 && n_jet>3 && jet_pt[0]>25e3 && jet_pt[1]>25e3 && jet_pt[2]>25e3 && jet_pt[3]>25e3 && met>230e3 && dphi_jet0_ptmiss>0.4 && dphi_jet1_ptmiss>0.4 && mt>30e3 && (mT2tauLooseTau_GeV>80||mT2tauLooseTau_GeV<0)&&1) * (1) * weight * xs_weight * sf_total * weight_sherpa22_njets * 150000")

[Screenshot from 2018-10-26 16-49-36]

Clearly, the second plot has more data points in it, despite having the same cut and weights!
Hoping to get some clarity from the forum.

Thanks!

Best,
Yosse


ROOT Version: 6.10
Platform: lxplus
Compiler: Not Provided


When you draw the resulting histogram as a scatter plot (Draw() without an option), the points are just random markers placed inside each bin to give an impression of density (the number of markers per bin is proportional to the bin content). The data are binned, unlike in the second plot, where the points are drawn at the exact met and mt values. I would suggest drawing the histogram with a more meaningful option, for instance COLZ:

h_temp.Draw("COLZ")
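
To make the difference concrete, here is a minimal pure-Python sketch (no ROOT needed) of what a scatter rendering of a binned histogram amounts to. The toy (mt, met) values, the helper names and the rule of one marker per unit of bin content are assumptions made for illustration only; ROOT scales the number of markers in its own way.

```python
import random

def bin_index(value, edges):
    """Return the 0-based bin index for value, or None if it lies outside the edges."""
    for i in range(len(edges) - 1):
        if edges[i] <= value < edges[i + 1]:
            return i
    return None

def fill_2d(points, x_edges, y_edges):
    """Count points per (x, y) cell; the exact coordinates are discarded."""
    counts = {}
    for x, y in points:
        ix, iy = bin_index(x, x_edges), bin_index(y, y_edges)
        if ix is not None and iy is not None:
            counts[(ix, iy)] = counts.get((ix, iy), 0) + 1
    return counts

def scatter_markers(counts, x_edges, y_edges, rng):
    """Place one random marker per unit of bin content, uniformly inside each cell."""
    markers = []
    for (ix, iy), n in sorted(counts.items()):
        for _ in range(n):
            markers.append((rng.uniform(x_edges[ix], x_edges[ix + 1]),
                            rng.uniform(y_edges[iy], y_edges[iy + 1])))
    return markers

# Hypothetical toy data: (mt, met) pairs in GeV, not taken from the ntuple.
points = [(40.0, 260.0), (45.0, 270.0), (120.0, 400.0), (130.0, 410.0), (135.0, 420.0)]
x_edges = [0.0, 100.0, 200.0]            # two mt bins
y_edges = [250.0, 350.0, 450.0, 550.0]   # three met bins, same edges as in the thread (in GeV)

counts = fill_2d(points, x_edges, y_edges)
markers = scatter_markers(counts, x_edges, y_edges, random.Random(1))
print(counts)    # per-cell contents
print(markers)   # marker positions: same cells, but not the original coordinates
```

The markers end up in the same cells as the original points but at different positions, so a scatter drawn from a coarsely binned histogram cannot look like the unbinned tree.Draw() output.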

Hi Olivier,

Thank you for the reply. I see what you mean… I tried drawing with “COLZ” and the same binning, and indeed the two come out the same. I just realized I may have presented the problem inaccurately.

So… the actual problem appears when I project the TH2D obtained from tree.Project() onto a TH1D using ProjectionX: this TH1D has far fewer entries than a TH1D filled directly with tree.Project() using additional cuts that should mimic the ProjectionX bin.

h_temp = TH2D("temp", "", 30, 0, 600e3, 3, array('d',[250e3, 350e3, 450e3, 550e3]))
tree.Project("temp", "met:mt", "(stxe_trigger==1 && lep_pt[0]>25e3 && n_jet>3 && jet_pt[0]>25e3 && jet_pt[1]>25e3 && jet_pt[2]>25e3 && jet_pt[3]>25e3 && met>230e3 && dphi_jet0_ptmiss>0.4 && dphi_jet1_ptmiss>0.4 && mt>30e3 && (mT2tauLooseTau_GeV>80||mT2tauLooseTau_GeV<0)&&1) * (1) * weight * xs_weight * sf_total * weight_sherpa22_njets * 150000")
h_proj = TH1D("proj", "", 30, 0, 600e3)
h_proj = h_temp.ProjectionX("proj", 1, 1, "e")
h_proj.Draw("e1")

[Screenshot from 2018-10-26 20-21-02]

Compare this with the TH1D filled directly from the tree, with an additional cut corresponding to the Y bin used in the projection above.
Notice the additional cut “met>250e3 && met<350e3”.

h_tree = TH1D("tree", "", 30, 0, 600e3)
tree.Project("tree", "mt", "(stxe_trigger==1 && lep_pt[0]>25e3 && n_jet>3 && jet_pt[0]>25e3 && jet_pt[1]>25e3 && jet_pt[2]>25e3 && jet_pt[3]>25e3 && met>230e3 && dphi_jet0_ptmiss>0.4 && dphi_jet1_ptmiss>0.4 && mt>30e3 && (mT2tauLooseTau_GeV>80||mT2tauLooseTau_GeV<0)&& met>250e3 && met<350e3) * (1) * weight * xs_weight * sf_total * weight_sherpa22_njets * 150000")
h_tree.Draw("e1")

[Screenshot from 2018-10-26 20-24-51]

Notice the difference in the number of entries and hence in the error bars. The bin contents are identical, but the errors are much larger in the projection case.

Thanks in advance!

Cheers,
Yosse

Hi,

Making sure this post is still visible.

Hi Yosse,

The bin contents are not exactly identical - but close; see the StdDev / Mean. You could subtract the two and see the diff… But more to the point: difficult to guess without knowing your data. Can you do the following:

  1. Compare
h_proj = h_temp.ProjectionX("proj", 1, h_temp.GetYaxis().GetNbins(), "e")

with

tree.Project("tree", "mt", "(stxe_trigger==1 && lep_pt[0]>25e3 && n_jet>3 && jet_pt[0]>25e3 && jet_pt[1]>25e3 && jet_pt[2]>25e3 && jet_pt[3]>25e3 && met>230e3 && dphi_jet0_ptmiss>0.4 && dphi_jet1_ptmiss>0.4 && mt>30e3 && (mT2tauLooseTau_GeV>80||mT2tauLooseTau_GeV<0)) * (1) * weight * xs_weight * sf_total * weight_sherpa22_njets * 150000")

(i.e. remove the supposedly identical cut on met>250e3 && met<350e3 / bin == 1). Still different number of entries?

If yes, you probably have plenty of entries in the under-/overflow bins, and those are skipped in the projection (but might get counted in TTree::Project because there is no “input” under/overflow). If you use Draw() instead of Project() and show the under-/overflows in the statistics box (e.g. gStyle->SetOptStat(111111)), you can diagnose this.
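
As a rough illustration of the under-/overflow point, here is a small pure-Python sketch of a histogram that keeps ROOT-style flow bins (index 0 for underflow, index Nbins+1 for overflow). The toy met values and weights are hypothetical and given in GeV; the function name is made up for this example.

```python
def fill_with_flow(entries, edges):
    """Fill a weighted 1D histogram; index 0 is underflow, index nbins+1 is overflow."""
    nbins = len(edges) - 1
    contents = [0.0] * (nbins + 2)
    for value, weight in entries:
        if value < edges[0]:
            contents[0] += weight               # below the first edge
        elif value >= edges[-1]:
            contents[nbins + 1] += weight       # at or above the last edge
        else:
            for i in range(nbins):
                if edges[i] <= value < edges[i + 1]:
                    contents[i + 1] += weight
                    break
    return contents

# Same met edges as in the thread, in GeV.
met_edges = [250.0, 350.0, 450.0, 550.0]
# Hypothetical (met, weight) pairs; every one of them passes a met > 230 cut.
toy = [(240.0, 1.0), (300.0, 2.0), (400.0, 0.5), (600.0, 4.0)]

contents = fill_with_flow(toy, met_edges)
nbins = len(met_edges) - 1
in_range = sum(contents[1:nbins + 1])   # what summing bins 1..Nbins sees
with_flow = sum(contents)               # what a cut-only selection sees
print(contents, in_range, with_flow)
```

In this toy case the events with met between 230 and 250 or above 550 pass the selection but sit in the flow bins, so summing bins 1..Nbins misses them. As far as I know, the analogue in ROOT would be to pass 0 and Nbins+1 as the bin range of the projection, but please check that against the TH2::ProjectionX documentation.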

Hi Axel,

Here is the comparison you advised to do:

h_proj = h_temp.ProjectionX("proj", 1, h_temp.GetYaxis().GetNbins(), "e")

[Screenshot from 2018-11-02 10-52-54]

For the Project histogram I added met>250e3 && met<550e3, so it is now comparable to the first histogram:

h_tree2 = TH1D("tree2", "", 30, 0, 600e3)
tree.Project("tree2", "mt", "(stxe_trigger==1 && lep_pt[0]>25e3 && n_jet>3 && jet_pt[0]>25e3 && jet_pt[1]>25e3 && jet_pt[2]>25e3 && jet_pt[3]>25e3 && met>250e3 && met<550e3 && dphi_jet0_ptmiss>0.4 && dphi_jet1_ptmiss>0.4 && mt>30e3 && (mT2tauLooseTau_GeV>80||mT2tauLooseTau_GeV<0)) * (1) * weight * xs_weight * sf_total * weight_sherpa22_njets * 150000")

[Screenshot from 2018-11-02 11-19-54]

So now both histograms have a met>250e3 && met<550e3 cut: the first from projecting Y bins 1 to Nbins, the second applied explicitly. The number of entries still does not match, but the uncertainties now do, which is very confusing. Also, I do not see how under-/overflow comes into play here, because even in the previous post there was a two-sided cut on met.
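
For reference when comparing the statistics boxes, here is a minimal pure-Python sketch of the quantities involved for a single weighted bin. It assumes the usual conventions for weighted histograms: the content is the sum of weights, the error is the square root of the sum of squared weights, and the effective number of entries is (sum w)^2 / (sum w^2). The weights are hypothetical. Whether a projection reports the raw fill count or an estimate such as the effective entries is an assumption that should be checked against the ROOT documentation for the version in use.

```python
import math

def bin_summary(weights):
    """Summarise one weighted bin: raw fill count, content, error, effective entries."""
    sumw = sum(weights)
    sumw2 = sum(w * w for w in weights)
    return {
        "entries": len(weights),                        # number of fills, ignores weights
        "content": sumw,                                # sum of weights
        "error": math.sqrt(sumw2),                      # sqrt of sum of squared weights
        "n_eff": sumw * sumw / sumw2 if sumw2 > 0 else 0.0,
    }

# Hypothetical event weights landing in the same bin.
summary = bin_summary([1.0, 1.0, 2.0])
print(summary)
```

The content and the error depend only on the weights, not on the entries counter, so two histograms can show identical contents and error bars while reporting different numbers of entries.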

Cheers,
Yosse
