I am working with RDataFrame and PyROOT to analyze my data.
I want to write a function that adds some columns and performs some calculations. At the end I want to return the processed DataFrame so I can use it in another function, for example to do some filtering. How would I do that? I am thinking of something like this:
    def DataFrameManipulation(rootfile):
        tree = somefunctionwichgetsthetree(rootfile)
        df = rt.RDataFrame(tree)
        df = df.Define("new column", some operation....)
        return df
    def Filtering(df):
        df = DataFrameManipulation(rootfile)
        counts = df.Filter(some filtering which includes the new column defined in the above function)...
I know that the example is not working, and I am sure this is pretty naive, but you might get the idea of what I am trying to achieve. I would appreciate any help. Thanks!
or equivalent, so RDF will take care of opening and closing files and creating and destructing the trees. If for some reason you need to remain in charge of extracting trees from files, then you can keep the tree alive by returning it from the function together with the RDF that needs it:
    def DataFrameManipulation(rootfile):
        tree = somefunctionwichgetsthetree(rootfile)
        df = rt.RDataFrame(tree)
        df = df.Define("new column", some operation....)
        return tree, df
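For the first option, where RDF handles the files and trees itself, a minimal sketch could look like the following. It assumes the RDataFrame is built from a tree name and a file name. The names `make_dataframe`, `add_columns`, `count_selected`, the branch `x`, the column `new_column` and the cut value are all made up for illustration and do not come from the original code.

```python
def make_dataframe(tree_name, file_name):
    """Build an RDataFrame from a tree name and a file name.

    RDataFrame then opens the file and reads the tree itself, so no
    Python-side TTree object has to be kept alive by the caller.
    """
    import ROOT  # imported lazily; requires a working PyROOT installation
    return ROOT.RDataFrame(tree_name, file_name)


def add_columns(df):
    """Return a new node with an extra column.

    "x" is a hypothetical branch name and "new_column" a hypothetical
    column name; replace both with whatever the tree really contains.
    """
    return df.Define("new_column", "x * 2")


def count_selected(df):
    """Count the entries passing a cut on the column defined above.

    Filter and Count are lazy; GetValue() triggers the event loop.
    """
    return df.Filter("new_column > 4").Count().GetValue()
```

With a file `data.root` containing a tree `events` (both assumed here), this would be used as `count_selected(add_columns(make_dataframe("events", "data.root")))`. Each function just takes a dataframe node and returns a new node or a result, so the steps can be chained or split across functions freely.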