Delphes root files and RDataFrame

ammelsayed · October 17, 2025, 6:59am

Hi,

I’m trying to perform some analysis cuts using RDataFrame on ROOT files generated by Delphes, but I’m running into a problem reading particle data from arrays. As I understand it, in a Delphes file an event may contain more than one particle of the same type, such as electrons or muons. If I do

import ROOT

df = ROOT.RDataFrame("Delphes", filepath)  # filepath: path to the Delphes output file
hist = df.Histo1D("Muon_size")
hist.Draw()

I get a histogram of the number of muons per event. For a particular event that has, say, 2 muons, the Muon.PT branch contains an array that stores the pT of the two muons. When I try to read the pT distribution of the muons with:

hist = df.Histo1D("Muon.PT")
hist.Draw()

I get a distribution for all muons in the file. But if I try to look at only the leading (or subleading) muon in pT, as follows:

hist = df.Histo1D("Muon.PT[0]")
hist.Draw()

I get the following error:

cppyy.gbl.std.runtime_error: Template method resolution failed:
  ROOT::RDF::RResultPtr<TH1D> ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void>::Histo1D(string_view vName) =>
    runtime_error: Unknown column: "Muon.PT[0]"

I’ve been trying many methods, but none of them worked. Could someone please help me with this problem?

ROOT Version: 6.32.12
Platform: Linux
Compiler: PyROOT

bellenot · October 17, 2025, 8:03am

Welcome to the ROOT Forum!
Maybe @vpadulan can help here

silverweed · October 17, 2025, 8:09am

Hi @ammelsayed,

Not an RDF expert, but could you try this?

hist = df.Define("LeadingMuon", "Muon.PT[0]").Histo1D("LeadingMuon")
hist.Draw()

ammelsayed · October 17, 2025, 8:20am

Hi @silverweed

It actually worked! But why? This is so strange!

silverweed · October 17, 2025, 8:33am

The reason is that Histo1D accepts a column name and not an expression, so you first need to Define a new column (which can use arbitrary expressions, including accessing other columns) and then you can do operations on the newly defined column.
Think of Define as “assigning a variable” that is then available for all further operations.
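
As a minimal sketch of that pattern, the Define step can be wrapped in a small helper. The function name leading_pt_hist and the derived column name are made up for illustration. The sketch also adds a Filter on the size branch, on the assumption that some events contain no muons, in which case Muon.PT[0] would read past the end of the array. It further assumes the collection is ordered by decreasing pT, so that index 0 is the leading object.

```python
def leading_pt_hist(df, collection="Muon", index=0):
    """Book a histogram of the pT of the index-th object in a collection.

    `df` is expected to behave like a ROOT.RDataFrame built on a Delphes tree.
    The helper and the derived column name are hypothetical, not ROOT API.
    """
    size_col = f"{collection}_size"          # e.g. "Muon_size"
    pt_col = f"{collection}.PT"              # e.g. "Muon.PT"
    new_col = f"{collection}_PT_{index}"     # e.g. "Muon_PT_0" (valid column name)

    return (
        # Assumption: skip events with too few objects, so that the
        # indexing below never goes out of bounds.
        df.Filter(f"{size_col} > {index}", f"at least {index + 1} {collection}")
        # Define accepts an expression and turns it into a named column.
        .Define(new_col, f"{pt_col}[{index}]")
        # Histo1D only accepts a column name, so pass the new column.
        .Histo1D(new_col)
    )
```

With a real dataframe this would be called as leading_pt_hist(df) for the leading muon, or leading_pt_hist(df, index=1) for the subleading one, followed by Draw() on the result.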

ammelsayed · October 17, 2025, 8:42am

This really helped me. Thank you very much!

vpadulan · October 17, 2025, 9:14am

Dear @ammelsayed ,

Thanks for reaching out! I absolutely agree with @silverweed, that is the right way to go. Just to give further context: allowing Histo1D("Muon.PT[0]") would be equivalent to allowing Histo1D("run_my_very_complicated_function_that_may_return_a_value_not_acceptable_for_a_histogram"). RDataFrame therefore deliberately distinguishes between a column (either on-disk or defined) and an expression. Expressions are only usable in the parts of the API that transform data (e.g. Define or Filter), not in the parts of the API that declare a result (e.g. Histo1D, Mean, Sum, etc.).
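
To illustrate the split, here is a rough sketch in which expressions appear only in Filter and Define, while Histo1D, Mean and Count receive plain column names (or nothing). The function name and the LeadingMuonPT column are illustrative, and the Muon_size > 0 guard is an assumption that events without muons exist in the file.

```python
def book_leading_muon_results(df):
    """Book several results on the leading-muon pT.

    `df` is expected to behave like a ROOT.RDataFrame built on a Delphes tree.
    The function name and the "LeadingMuonPT" column are hypothetical.
    """
    # Transformations: these accept expressions.
    selected = (
        df.Filter("Muon_size > 0", "at least one muon")  # assumed guard
        .Define("LeadingMuonPT", "Muon.PT[0]")
    )

    # Results: these accept column names only, never expressions.
    return {
        "hist": selected.Histo1D("LeadingMuonPT"),
        "mean": selected.Mean("LeadingMuonPT"),
        "count": selected.Count(),
    }
```

With a real RDataFrame the returned objects are lazy: accessing one of them, for example with results["mean"].GetValue(), runs the event loop and fills all the results booked so far in the same pass.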

Cheers,

Vincenzo

ammelsayed · October 20, 2025, 6:14am

Dear @vpadulan

Thanks a lot for the clear explanation, this really helps me. RDF is really great for quick analyses, and it’s very helpful!
