Reviving: Access RDataFrame column in function without passing argument

Hi Lucas,

how can I define FatJet_pt before myFunc() so that it compiles but so that it can also access the value in the RDataFrame?

How can I point to it inside of myFunc() so that the value updates once RDataFrame moves onto the next row/event?

That’s just not possible, I’m afraid. You would need RDF to expose pointers to the column values that are automatically updated during the event loop – those are internals that RDF does not expose by design.

I can think of two alternative approaches.

  1. you could generate the correct invocations for users on the fly:
rdf.Define('myVar', Lcorcoframework.MakeInvMass("muon"))

where MakeInvMass("muon") returns a string like InvMass(muon_pt, muon_eta, muon_phi).

  2. you could provide helpers that take and return dataframes:
rdf = Lcorcoframework.AddInvMass("muon", rdf)

where AddInvMass does something like return rdf.Define("muon_invmass", "InvMass(muon_pt, muon_eta, muon_phi)").

The examples are imprecise, but I hope they get the idea across.
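
To make the two approaches a bit more concrete, here is a minimal Python sketch. It assumes that a C++ function InvMass has already been declared to the interpreter (for example with ROOT.gInterpreter.Declare) and that the columns follow the <prefix>_pt, <prefix>_eta, <prefix>_phi naming. The helper bodies are illustrative guesses, not an existing API:

```python
def MakeInvMass(prefix):
    """Approach 1: build the expression string that Define will JIT-compile."""
    # Assumes columns named <prefix>_pt, <prefix>_eta and <prefix>_phi exist.
    return f"InvMass({prefix}_pt, {prefix}_eta, {prefix}_phi)"


def AddInvMass(prefix, rdf):
    """Approach 2: take a dataframe, return a new node with the extra column."""
    # Define does not modify rdf in place; it returns a new node,
    # so the caller has to keep the returned object.
    return rdf.Define(f"{prefix}_invmass", MakeInvMass(prefix))
```

Users would then write either rdf = rdf.Define("myVar", MakeInvMass("muon")) or rdf = AddInvMass("muon", rdf). In both cases the column names end up in a string that RDF compiles itself, so nothing needs a pointer into the event loop.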

By the way, you might also be interested in bamboo, a pythonic framework for analyzing NanoAODs based on RDataFrame.

Cheers,
Enrico
