I am playing a bit with RDataFrame and have run into a problem. I have some vector<float*> branches in a TTree, and I want to compute the mean of individual indexed elements. My code so far is very simple (and uses PyROOT). I expected that I would need to define a new column to flatten the variable, e.g. rdf.Define("j0","jet_pt[0]"). This call is accepted, but when I then try to calculate the mean, e.g. rdf.Mean("j0").GetValue(), I get the error TypeError: can not resolve method template call for 'Mean'. Do I need to cast the variable somehow in Python when I create the new column, or do I need to call something to “activate” the new column? Calling rdf.Mean("jet_pt") does work and returns a value computed over all the elements of the vector.
I am not sure how to continue and could not find a tutorial showing this.
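Define does not modify rdf in place; it returns a new dataframe node, i.e. j0 is only defined in dataframes “downstream” of the Define call. Either store the object that Define returns and call Mean on that, or just chain the calls: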
m = rdf.Define("j0","jet_pt[0]").Mean("j0").GetValue()
Note that I have not tested the code, but it should give you the idea.
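Spelled out a bit more, the same idea might look like the sketch below. The helper name mean_of_element, the demo function and the toy 4-entry dataframe are made up for illustration (they are not part of ROOT or of the original code), and I have not run this against any particular ROOT version. The only RDataFrame calls it relies on are Define, Mean and GetValue.

```python
def mean_of_element(rdf, column, index, new_name="j0"):
    """Return the mean of column[index] over all entries.

    Define returns a new node; the new column exists only on that node
    and on nodes derived from it, so we keep using the returned object.
    """
    expr = f"{column}[{index}]"              # e.g. "jet_pt[0]"
    with_new_col = rdf.Define(new_name, expr)  # rdf itself is left unchanged
    return with_new_col.Mean(new_name).GetValue()


def demo():
    # Imported here so the helper above can be used without ROOT at import time.
    import ROOT

    # Hypothetical stand-in for the real tree: 4 entries, each with a
    # two-element vector column called jet_pt.
    rdf = ROOT.RDataFrame(4).Define(
        "jet_pt", "ROOT::VecOps::RVec<float>{float(rdfentry_), 100.f}"
    )
    return mean_of_element(rdf, "jet_pt", 0)
```

With a real tree you would pass your own dataframe instead of the toy one, e.g. mean_of_element(rdf, "jet_pt", 0). Calling rdf.Mean("j0") on the original rdf afterwards would still fail, because j0 lives only on the node returned by Define.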
Unfortunately PyROOT sometimes hides the proper error messages that would be part of a C++ exception. The situation might be better with newer ROOT versions (in fact, please do not use RDataFrame with v6.14 if you can avoid it: the number of fixes and improvements since then has been enormous).