Multiple particles in event RDataFrame


I am new to the ROOT and trying to plot some histograms using RDataFrame. However, I encountered some difficulties. After MC generation I converted .lhe files to .root files and loaded them successfully, but I don’t know how to handle such trees that have multiple particles in a single event. I am using python and don’t know much about c++, so I assume this is the main problem. I am posting the minimal example below:

data = TChain('LHEF', 'LHEF')

c = TCanvas()
model = ROOT.RDF.TH1DModel("model","",64,0,2000)

rframe = RDataFrame(data)
rframe = rframe.Define('fourVect', 'ROOT::Math::PxPyPzE4D(Particle.Px, Particle.Py, Particle.Pz, Particle.E)').Define('pt','fourVect.Pt()').Histo1D(model, 'pt')

Here it works ok, but I want to filter and plot parameters for different particles now. So my guess e.g. for photon pt plot would be to add


before Histo1D, but the program complains. I assume the problem is that I am comparing an array of particle PIDs in a single event to an integer (since when I ran a test on Particle_size - which is integer in this tree works ok). So my question is how to extract desired particles from events with different numbers of other particles?

Thank you in advance,

EDIT: Sorry, my rootfile somehow was not attached and I thought it was.
rootfile.root (347.3 KB)

Hi Gasar8,

I think I miss something here. If your snippet works, I am not sure why the filter should not. You are working on individual particles and not collections I think. I allow myself to strip down the example:

model = ROOT.RDF.TH1DModel("model","",64,0,2000)
rframe = RDataFrame(LHEF, 'rootfile.root')
hist = rframe.Filter('Particle.PID == 22')\
             .Define('pt', 'sqrt(Particle.Px * Particle.Px + Particle.Py * Particle.Py)')\
             .Histo1D(model, 'pt')
c = TCanvas()

Hi Pnine,

thank you for your quick reply. Unfortunately, your snippet doesn’t work for me. I am using SWAN notebook and after trying out your snippet, it produced a very long error, basically the same as before. I don’t know if you tried to run it on my root file, but my file has many events, each of them consisting of multiple particles, e.g.:

Particle.PID    = 1, 2, 16, -16, 22, 2, 1
Particle.Px     = -0, 0, 80.0265, 415.613, 271.311, -792.821, 25.8705
Particle.Py     = 0, -0, -92.0465, -122.008, 857.123, -639.156, -3.91291
Particle.Pz     = 2087.99, -2733.02, 157.537, 503.023, -1818.83, 1008.43, -495.19
Particle.E      = 2087.99, 2733.02, 199.236, 663.817, 2028.9, 1433.18, 495.881

so I assume, that when only defining the objects, RDataFrame is kind of able to extract all of the particles in events and then plotting all of them for a given wanted parameter. Anyway, if I try to extract only one specific kind of particle from the event, the problem occurs - as I said, I assume that Filter can not compare only an integer (e.g. PID = 22) to an array of particles like Particle.PID = 1, 2, 16, -16, 22, 2, 1. Hope I am clear enough.


I was confused by the name “Particle”: indeed it is a soa, which contains data for many particles!
So if you want to select photons only, you’ll need to use array syntax, for example, suppose you want to fill a histogram with the Pt of photons only, the core part should be:

hist = rframe.Define('pt', 'sqrt(Particle.Px * Particle.Px + Particle.Py * Particle.Py)')\
             .Define('photonPt', 'Take(pt, Nonzero(Particle.PID == 22))')\
             .Histo1D(model, 'photonPt')


1 Like

Wow, that is awesome, thank you very much. :slight_smile:
Do you maybe have an idea how to also sort same particles by their properties, e.g. if I want to plot the invariant mass of two leading jets, how do I determine/extract the two with highest energy?