I do not seem to be able to define a new column with RDataFrame:
In [1]: import ROOT
In [2]: f = ROOT.TFile.Open('root://path/to/file.root')
In [3]: t = f.Get('tree name')
In [4]: df = ROOT.RDataFrame(t)
In [5]: df.Define('pt_test', 'sqrt(X_PX*X_PX + X_PY*X_PY)')
Out[5]: <ROOT.ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void> object at 0x7fb4004ab840>
In [6]: 'pt_test' in df.GetColumnNames()
Out[6]: False
In [7]: h1 = df.Histo1D('X_PX')
In [8]: h2 = df.Histo1D('pt_test')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-8-0985558c03c3> in <module>()
----> 1 h2 = df.Histo1D('pt_test')
TypeError: can not resolve method template call for 'Histo1D'
What am I doing wrong?
ROOT Version: 6.15/01
Platform: macOS
Compiler: Not Provided
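Define does not modify df in place: it returns a new dataframe node, and pt_test is visible only from that node onwards, so you need to keep the returned object and book Histo1D on it. This way RDF lets you define a computation graph with complex dependencies, and you have fine-grained control over the visibility of columns.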
If needed you could even define the same column for different branches of the computation graph.
See the user guide for a more detailed explanation, or ask here if I was not clear enough!
Cheers,
Enrico
Oh I see. Thank you. This usage is clear from the Define documentation, but it is sort of buried in the Crash Course, where the custom columns explanation includes a code snippet that led me to believe I was using Define correctly. Your reply makes clear what is intended by “from the point of definition onwards”.