I’m seeing different results when filling a histogram from the same TTree using two different methods.
# get the file (69MB) from https://cernbox.cern.ch/s/aotAznyWsQVSqxa
import ROOT

f = ROOT.TFile.Open("jetE.root")
tr = f.jtree
hr = ROOT.TH1D("hr","hr", 100, 0,2.)
tr.Project("hr","jet_E/jet_true_E")
print(hr.GetBinContent(44))
which gives 1187233.0
When using RDataFrame:
df = ROOT.RDataFrame(tr)
df = df.Define('r', "jet_E/jet_true_E")
h = df.Histo1D(("hr", "hr", 100, 0, 2.), 'r').GetPtr()
print(h.GetBinContent(44))
I get 1187240.0
Am I doing something wrong? Or is there something wrong with Project or Histo1D?
Cheers,
P-A
PS: editing to add that I also tried an entirely different method: loading the data with uproot and then calling hr.FillN(); I get the same result as in the second (RDataFrame) case.
ROOT Version: 6.26/02
Platform: linux
Compiler: Not Provided
My suspicion is …
Both “jet_E” and “jet_true_E” are “Float_t”.
The exact value of “jet_E/jet_true_E” depends on when the operands get promoted to “Double_t”, so the computed ratio can occasionally land in the neighboring bin. You can force the promotion explicitly:
df = df.Define('r', "Double_t(jet_E)/Double_t(jet_true_E)")
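To see the mechanism outside ROOT, here is a minimal numpy sketch (invented uniform values and an arbitrary seed, not the jetE.root data) that computes a Float_t-style ratio once entirely in float32 and once after promoting the operands to float64, then bins both results on the 100-bin [0, 2] axis used above:

```python
import numpy as np

rng = np.random.default_rng(0)
# stand-ins for the Float_t branches jet_E and jet_true_E (invented values)
a = rng.uniform(50.0, 150.0, 10_000_000).astype(np.float32)
b = rng.uniform(50.0, 150.0, 10_000_000).astype(np.float32)

r32 = a / b                                        # division carried out in float32
r64 = a.astype(np.float64) / b.astype(np.float64)  # operands promoted to double first

edges = np.linspace(0.0, 2.0, 101)  # 100 bins on [0, 2], as in the histograms above
i32 = np.digitize(r32, edges)
i64 = np.digitize(r64, edges)

moved = np.count_nonzero(i32 != i64)
print(moved)  # a handful of entries out of 10M land in a neighboring bin
```

Only a few entries per million migrate to an adjacent bin, which is the same kind of small discrepancy seen between Project and Histo1D above.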
Thanks! You mean that Project converts to double before taking the ratio, while RDataFrame performs it in float? The 7 differing entries would then come from numerical fluctuation around the bin boundaries…
That makes sense, but is 7 out of ~1M compatible with the difference in numerical precision between float and double?
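A rough back-of-envelope estimate (assuming ratios of order 1 and the 100-bin [0, 2] axis above) suggests it is:

```python
import numpy as np

# A float32 result differs from the float64 one by at most half an ulp,
# so the average displacement is roughly a quarter ulp.
ulp = float(np.spacing(np.float32(1.0)))  # ulp of float32 near 1.0, ~1.2e-7
bin_width = 2.0 / 100                     # 100 bins on [0, 2]

p_cross = (ulp / 4) / bin_width           # chance one entry crosses a bin edge
expected = 1_000_000 * p_cross            # expected migrations per ~1M entries
print(expected)                           # order of one per million entries
```

This lands at on the order of one to a few migrations per million entries, the same order of magnitude as the 7 observed.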