Delphes root files and RDataFrame


Hi,

I'm trying to perform some analysis cuts using RDataFrame on ROOT files generated by Delphes, but I have run into a problem reading particle data from arrays. As I understand it, an event in a Delphes file may contain more than one particle of the same type, such as electrons or muons. If I do

import ROOT

df = ROOT.RDataFrame("Delphes", filepath)
hist = df.Histo1D("Muon_size")
hist.Draw()

I get a histogram of the number of muons per event. For an event with, say, 2 muons, the Muon.PT branch contains an array holding the pT of both muons. When I try to read the pT distribution of the muons with:

hist = df.Histo1D("Muon.PT")
hist.Draw()

I get a distribution for all muons in the file. But if I try to look at only the leading (or subleading) muon in pT, as follows:

hist = df.Histo1D("Muon.PT[0]")
hist.Draw()

I get the following error:

cppyy.gbl.std.runtime_error: Template method resolution failed:
  ROOT::RDF::RResultPtr<TH1D> ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void>::Histo1D(string_view vName) =>
    runtime_error: Unknown column: "Muon.PT[0]"

I've been trying many methods but none of them worked. Can someone please help me with this problem?

ROOT Version: 6.32.12
Platform: Linux
Compiler: PyROOT


Welcome to the ROOT Forum!
Maybe @vpadulan can help here

Hi @ammelsayed,

Not an RDF expert, but could you try this?

hist = df.Define("LeadingMuon", "Muon.PT[0]").Histo1D("LeadingMuon")
hist.Draw()

Hi @silverweed

It actually worked! But why? This is so strange :slight_smile:

The reason is that Histo1D accepts a column name and not an expression, so you first need to Define a new column (which can use arbitrary expressions, including accessing other columns) and then you can do operations on the newly defined column.
Think about Define as “assigning a variable” that is then available for all further operations :slight_smile:
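
As a minimal sketch of that pattern: the helper below wraps the Define-then-Histo1D chain. The name book_leading_pt and its parameters are made up for illustration and are not part of ROOT. It adds a Filter on Muon_size on the assumption that indexing Muon.PT[0] in an event with no muons is not safe, and it assumes the collection is ordered by decreasing pT, so that index 0 is the leading muon.

```python
def book_leading_pt(df, collection="Muon", index=0):
    """Book a histogram of the pT of the index-th object in each event.

    `df` is expected to be a ROOT.RDataFrame (or a node derived from one).
    The helper name and its parameters are illustrative, not part of ROOT.
    """
    column = f"{collection}_PT{index}"  # new column name, e.g. "Muon_PT0"
    return (
        # Assumption: skip events that have no element at position `index`.
        df.Filter(f"{collection}_size > {index}")
        # Define turns the expression into a named column ...
        .Define(column, f"{collection}.PT[{index}]")
        # ... which Histo1D can then take by name.
        .Histo1D(column)
    )
```

With the dataframe from the first post, book_leading_pt(df).Draw() would draw the leading muon pT, and book_leading_pt(df, index=1).Draw() the subleading one.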

This really helped me. Thank you very much!


Dear @ammelsayed ,

Thanks for reaching out! I absolutely agree with @silverweed, that is the right way to go. Just to give further context: allowing Histo1D("Muon.PT[0]") would be equivalent to allowing Histo1D("run_my_very_complicated_function_that_may_return_a_value_not_acceptable_for_a_histogram"). RDataFrame therefore deliberately distinguishes between a column (either on-disk or defined) and an expression. Expressions are only usable in the parts of the API that transform data (e.g. Define or Filter), not in the parts of the API that declare a result (e.g. Histo1D, Mean, Sum, etc.).
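
To illustrate the split, here is a rough sketch in which expressions appear only in Filter and Define, while Mean and Histo1D receive plain column names. The function name summarize_leading_pt and the 20.0 threshold are arbitrary choices for the example, and the Muon_size guard assumes that events without muons should be skipped.

```python
def summarize_leading_pt(df, min_pt=20.0):
    """Book a mean and a histogram of the leading muon pT above `min_pt`.

    `df` is expected to be a ROOT.RDataFrame on the "Delphes" tree.
    The function name and the default threshold are illustrative only.
    """
    selected = (
        # Transformations: these accept expressions.
        df.Filter("Muon_size > 0")            # assumption: drop events with no muons
        .Define("LeadingMuon", "Muon.PT[0]")  # expression becomes a named column
        .Filter(f"LeadingMuon > {min_pt}")    # expression using the defined column
    )
    # Results: these take column names, not expressions.
    return {
        "mean": selected.Mean("LeadingMuon"),
        "hist": selected.Histo1D("LeadingMuon"),
    }
```

Both results are booked lazily, so something like summarize_leading_pt(df)["mean"].GetValue() is what would actually trigger the event loop.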

Cheers,

Vincenzo
