My dataset contains certain events with missing variables (columns in RDataFrame).
Using PyROOT, I want to create a Filter such that, for events missing this variable, I can catch the error and create the missing column with a user-defined value.
When the Filter refers to the missing column, Python raises the exception:
TypeError: can not resolve method template call for 'Filter'
which by itself is not very helpful.
However, I see that RDF already prints an error message:
input_line_43:1:46: error: use of undeclared identifier 'GenModel_YMass_125'
which identifies the missing column name.
My question is: How can I retrieve the name of the column which RDF already identifies instead of writing my own function with pattern matching etc.?
At the moment the Filter method I use is https://root.cern/doc/master/classROOT_1_1RDF_1_1RInterface.html#af415d0a369aaa449492563f47a13fd37
with a simple C++ expression of the form
GenModel_YMass_125==1
but our use case also contains more complicated forms.
To be specific, I would like PyROOT to get the error message (catch the exception), store the
undeclared identifier 'GenModel_YMass_125' in a variable (say 'missing_column'), and call Define(missing_column, user_specified_value).
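To illustrate the intended workflow, here is a minimal sketch. It assumes that the text of the interpreter error message is already available as a Python string; how to obtain that text from RDF is exactly the open question. The helper name find_undeclared_identifiers is hypothetical and not part of ROOT, and it still relies on the kind of pattern matching I would prefer to avoid.

```python
import re

# Pattern for the interpreter diagnostic quoted above.
# Assumption: the wording of this diagnostic is stable across ROOT versions.
_UNDECLARED = re.compile(r"use of undeclared identifier '([^']+)'")


def find_undeclared_identifiers(error_text):
    """Return the undeclared identifiers named in error_text.

    Names are returned in order of first appearance, without duplicates.
    """
    names = []
    for name in _UNDECLARED.findall(error_text):
        if name not in names:
            names.append(name)
    return names


# The message from this post, used here as a plain string.
message = ("input_line_43:1:46: error: use of undeclared identifier "
           "'GenModel_YMass_125'")
missing_columns = find_undeclared_identifiers(message)
print(missing_columns)
```

Each name in missing_columns could then be passed to Define together with the user-specified value before the Filter is retried.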
Note that the HasColumn check is cumbersome, since I would first have to extract the column names from the filter expression. This is what I am trying to avoid.
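For comparison, this is roughly what the approach I want to avoid looks like. It is only a sketch: the function name columns_missing_from and the example column list are hypothetical, and the regular expression is a rough heuristic rather than a C++ parser, so more complicated expressions may defeat it.

```python
import re

# Identifier that is not preceded by a word character or '.', and not
# followed by '(' or '::'. This skips function calls, namespaces, member
# accesses and numeric suffixes such as the 'e3f' in '1.5e3f'.
_IDENTIFIER = re.compile(r"(?<![\w.])[A-Za-z_]\w*\b(?!\s*(?:\(|::))")

# Words that are valid in an expression but are never column names.
# This list is illustrative, not exhaustive.
_NOT_COLUMNS = frozenset(["true", "false", "nullptr"])


def columns_missing_from(expression, available_columns):
    """Return identifiers used in expression that are not available columns."""
    available = set(available_columns)
    missing = []
    for name in _IDENTIFIER.findall(expression):
        if name in _NOT_COLUMNS or name in available or name in missing:
            continue
        missing.append(name)
    return missing


# Hypothetical column list; with an RDataFrame the names could instead be
# taken from GetColumnNames(), or each candidate checked with HasColumn.
available = ["GenModel_YMass_90", "nJet"]
print(columns_missing_from("GenModel_YMass_125==1 && nJet>2", available))
```

Every special case in the expression syntax needs another rule in the pattern, which is why reusing the name that RDF has already identified would be preferable.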
ROOT Version: 6.20/04
Platform: lxplus7
Compiler: Not Provided
Python Version: 2.7.5