RDataFrame Histo1D to extract histogram from an indexed vector branch


I am working with a tree that has a branch containing a vector object called “jet_pt”, and I only want to extract the 0th element and fill a histogram with it. I do this in PyROOT:

d = ROOT.RDataFrame(tree)
Hist = d.Histo1D("jet_pt[0]")

This gives the following error message:

Exception: ROOT::RDF::RResultPtr<TH1D> ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter,void>::Histo1D<ROOT::Detail::RDF::TInferType, ROOT::Detail::RDF::TInferType>(const ROOT::RDF::TH1DModel& model, experimental::basic_string_view<char,char_traits<char> > vName, experimental::basic_string_view<char,char_traits<char> > wName) =>
    Unknown column: jet_pt[0] (C++ exception of type runtime_error)

It seems that RDataFrame cannot process the index. In TTree this can be done with,

tree.Project(Hist, "jet_pt[0]")

I was assuming RDataFrame would have that functionality. Is there any way around this?
Thanks in advance!

ROOT Version: 6.14
Platform: lxplus
Compiler: Not Provided


I think you could do either of the following

auto rdf = d.Define("jetpt0", "jet_pt[0]");
Hist = rdf.Histo1D("jetpt0");


Hist = d.Histo1D({"hjet_pt0", "leading jet pt", 100, 0, 500}, "jet_pt[0]");


I’ll make it clearer in the reference guide: the argument passed to Histo1D must be a column name, not an expression. Indeed, as @Suyong_Choi says, you need

auto h = df.Define("jet0", "jet_pt[0]").Histo1D("jet0");

to do what you want. It’s more verbose but also overall more powerful/flexible.
The syntax you used, d.Histo1D("jet_pt[0]"), is syntactic sugar that we might add in the future.
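
Since the original question was in PyROOT, here is a minimal sketch of the same Define-then-Histo1D approach in Python. The helper names (leading_element_exprs, histo_of_element) are hypothetical and not part of ROOT. The extra Filter is an assumption on my side: it skips events where the vector is too short, because indexing an empty vector is not safe. The binning is taken from the example earlier in this thread, and the histogram model is passed as a tuple, which relies on PyROOT converting it to a TH1DModel.

```python
def leading_element_exprs(branch, index=0):
    """Build the strings needed to histogram branch[index].

    Returns (filter expression, new column name, define expression).
    Hypothetical helper, not part of ROOT.
    """
    return (
        f"{branch}.size() > {index}",  # keep only events that have this element
        f"{branch}_{index}",           # name of the new scalar column
        f"{branch}[{index}]",          # expression that fills the new column
    )


def histo_of_element(df, branch, index=0, nbins=100, xmin=0.0, xmax=500.0):
    """Book a 1D histogram of branch[index] on an RDataFrame node.

    Histo1D is given a column name, never an expression, so the
    element is first turned into its own column with Define.
    """
    has_elem, col, expr = leading_element_exprs(branch, index)
    # Assumption: PyROOT converts this tuple to a TH1DModel
    # (name, title, number of bins, lower edge, upper edge).
    model = (f"h_{col}", expr, nbins, xmin, xmax)
    return df.Filter(has_elem).Define(col, expr).Histo1D(model, col)


# Usage, assuming `tree` is the TTree from the original post:
#   import ROOT
#   d = ROOT.RDataFrame(tree)
#   hist = histo_of_element(d, "jet_pt")
#   hist.Draw()   # the event loop runs on first access to the result
```

Keeping the string building separate from the booking makes it easy to reuse the same pattern for other vector branches or other indices.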



I can confirm that the following solution works!

auto h = df.Define("jet0", "jet_pt[0]").Histo1D("jet0");

Thank you, Suyong and Enrico!

