RDataFrame in python loop

ROOTer1 · September 20, 2021, 1:57pm

Dear,

I heard that RDataFrame is much more efficient than classic loops, but I’m not sure how to use it for simple code like the following in Python:

import ROOT

chain = ROOT.TChain()
chain.Add(Dataset)
h = ROOT.TH1D("h", "", n, x_min, x_max)
lvmu1 = ROOT.TLorentzVector()
lvmu2 = ROOT.TLorentzVector()
for ev in chain:
    # Build the four-vectors of the first two muons in the event
    lvmu1.SetPtEtaPhiE(ev.pt_muon[0], ev.eta_muon[0], ev.phi_muon[0], ev.E_muon[0])
    lvmu2.SetPtEtaPhiE(ev.pt_muon[1], ev.eta_muon[1], ev.phi_muon[1], ev.E_muon[1])
    # Fill the histogram with the pT of their sum
    lvmu3 = lvmu1 + lvmu2
    h.Fill(lvmu3.Pt())

bellenot · September 20, 2021, 2:57pm

From your example, it’s not clear what you’re trying to do, but maybe @etejedor can help you.

ROOTer1 · September 20, 2021, 3:33pm

Hi Bertrand! I’m trying to fill a histogram with a calculated transverse momentum from the sum of two vectors

etejedor · September 20, 2021, 3:48pm

Hello,

Please have a look at the RDataFrame docs first:

https://root.cern/doc/master/classROOT_1_1RDataFrame.html

With RDataFrame, you don’t iterate over the events explicitly (the loop is hidden inside RDataFrame and runs much more efficiently in C++). Instead, you apply per-event transformations to your dataset and then request results from it.

In your code, you would construct an RDataFrame from your dataset, then for example define a new column pt from a function that does the per-event calculation you posted here. After that, you would call a Histo1D action on that column.
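
As a rough sketch of those steps (not a tested recipe for your dataset), something along these lines should work. The branch names are taken from your loop. The tree name, file name, the new column name pt_mumu, and the function and helper names are placeholders I made up. The Filter on at least two muons is also my addition, since your loop indexes [0] and [1] without checking. The small pure-Python helper dimuon_pt is only there to cross-check by hand what the Define expression computes, since the pT of a sum depends only on the pT and phi of the two muons.

```python
import math

# C++ expression evaluated once per event by RDataFrame.
# Branch names (pt_muon, eta_muon, phi_muon, E_muon) come from the original loop.
DIMUON_PT_EXPR = """
ROOT::Math::PtEtaPhiEVector mu1(pt_muon[0], eta_muon[0], phi_muon[0], E_muon[0]);
ROOT::Math::PtEtaPhiEVector mu2(pt_muon[1], eta_muon[1], phi_muon[1], E_muon[1]);
return (mu1 + mu2).Pt();
"""


def dimuon_pt(pt1, phi1, pt2, phi2):
    """Pure-Python cross-check: pT of the sum of two momenta.

    Only the transverse components enter, so eta and E are not needed.
    """
    px = pt1 * math.cos(phi1) + pt2 * math.cos(phi2)
    py = pt1 * math.sin(phi1) + pt2 * math.sin(phi2)
    return math.hypot(px, py)


def book_dimuon_pt_hist(tree_name, files, n, x_min, x_max):
    """Book (lazily) the histogram of the dimuon pT with RDataFrame.

    tree_name and files are placeholders for your own dataset.
    """
    # Imported here so the helper above can be used without ROOT installed.
    import ROOT

    # Make sure the GenVector four-vector types are known to the interpreter.
    ROOT.gInterpreter.Declare('#include "Math/Vector4D.h"')

    df = ROOT.RDataFrame(tree_name, files)
    selected = (
        # Assumption: skip events with fewer than two muons.
        df.Filter("pt_muon.size() >= 2", "at least two muons")
        # New column holding the per-event result.
        .Define("pt_mumu", DIMUON_PT_EXPR)
    )
    # Nothing runs yet: the event loop starts when the result is first accessed.
    return selected.Histo1D(("h", "", n, x_min, x_max), "pt_mumu")


# Example usage (tree and file names are hypothetical):
# h = book_dimuon_pt_hist("tree", "data.root", 100, 0.0, 200.0)
# h.Draw()  # accessing the result triggers the event loop
```

If you want to use several cores, calling ROOT.EnableImplicitMT() before constructing the RDataFrame should be enough; the rest of the code stays the same.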