Multiple particles in event RDataFrame

Hi,

I am new to the ROOT and trying to plot some histograms using RDataFrame. However, I encountered some difficulties. After MC generation I converted .lhe files to .root files and loaded them successfully, but I don’t know how to handle such trees that have multiple particles in a single event. I am using python and don’t know much about c++, so I assume this is the main problem. I am posting the minimal example below:

data = TChain('LHEF', 'LHEF')
data.AddFile('rootfile.root')

c = TCanvas()
model = ROOT.RDF.TH1DModel("model","",64,0,2000)

rframe = RDataFrame(data)
rframe = rframe.Define('fourVect', 'ROOT::Math::PxPyPzE4D(Particle.Px, Particle.Py, Particle.Pz, Particle.E)').Define('pt','fourVect.Pt()').Histo1D(model, 'pt')
rframe.Draw()
c.SetLogy()
c.Draw()

Here it works ok, but I want to filter and plot parameters for different particles now. So my guess e.g. for photon pt plot would be to add

.Filter('Particle.PID==22')

before Histo1D, but the program complains. I assume the problem is that I am comparing an array of particle PIDs in a single event to an integer (since when I ran a test on Particle_size - which is integer in this tree works ok). So my question is how to extract desired particles from events with different numbers of other particles?

Thank you in advance,
gasar8

EDIT: Sorry, my rootfile somehow was not attached and I thought it was.
rootfile.root (347.3 KB)

Hi Gasar8,

I think I miss something here. If your snippet works, I am not sure why the filter should not. You are working on individual particles and not collections I think. I allow myself to strip down the example:

model = ROOT.RDF.TH1DModel("model","",64,0,2000)
rframe = RDataFrame(LHEF, 'rootfile.root')
hist = rframe.Filter('Particle.PID == 22')\
             .Define('pt', 'sqrt(Particle.Px * Particle.Px + Particle.Py * Particle.Py)')\
             .Histo1D(model, 'pt')
c = TCanvas()
hist.Draw()
c.SetLogy()
c.Draw()

Hi Pnine,

thank you for your quick reply. Unfortunately, your snippet doesn’t work for me. I am using SWAN notebook and after trying out your snippet, it produced a very long error, basically the same as before. I don’t know if you tried to run it on my root file, but my file has many events, each of them consisting of multiple particles, e.g.:

...
Particle.PID    = 1, 2, 16, -16, 22, 2, 1
Particle.Px     = -0, 0, 80.0265, 415.613, 271.311, -792.821, 25.8705
Particle.Py     = 0, -0, -92.0465, -122.008, 857.123, -639.156, -3.91291
Particle.Pz     = 2087.99, -2733.02, 157.537, 503.023, -1818.83, 1008.43, -495.19
Particle.E      = 2087.99, 2733.02, 199.236, 663.817, 2028.9, 1433.18, 495.881
...

so I assume, that when only defining the objects, RDataFrame is kind of able to extract all of the particles in events and then plotting all of them for a given wanted parameter. Anyway, if I try to extract only one specific kind of particle from the event, the problem occurs - as I said, I assume that Filter can not compare only an integer (e.g. PID = 22) to an array of particles like Particle.PID = 1, 2, 16, -16, 22, 2, 1. Hope I am clear enough.

Hi,

I was confused by the name “Particle”: indeed it is a soa, which contains data for many particles!
So if you want to select photons only, you’ll need to use array syntax, for example, suppose you want to fill a histogram with the Pt of photons only, the core part should be:

hist = rframe.Define('pt', 'sqrt(Particle.Px * Particle.Px + Particle.Py * Particle.Py)')\
             .Define('photonPt', 'Take(pt, Nonzero(Particle.PID == 22))')\
             .Histo1D(model, 'photonPt')

Cheers,
P

1 Like

Wow, that is awesome, thank you very much. :slight_smile:
Do you maybe have an idea how to also sort same particles by their properties, e.g. if I want to plot the invariant mass of two leading jets, how do I determine/extract the two with highest energy?