Using RDataFrames on branches with spaces in the name

Dear ROOT experts,

I need to work with .root files that contain trees whose branch names contain spaces (in the example below, a trailing space). I cannot change how the files are produced.

auto a = tree_PFLOW->GetListOfBranches()
a->At(440)->GetName()
(const char *) "weight_var_th_nominal "

If I try to access it with RDataFrame

df = ROOT.RDataFrame('tree_PFLOW', 'file.root')
df.Define('x', 'weight_var_th_nominal ')

I get the following error:

input_line_73:2:28: error: use of undeclared identifier 'weight_var_th_nominal'
auto lambda0 = [](){return weight_var_th_nominal
                           ^
input_line_77:2:28: error: use of undeclared identifier 'weight_var_th_nominal'
auto lambda0 = [](){return weight_var_th_nominal
                           ^
Traceback (most recent call last):
  File "/eos/home-a/apetukho/IncZZ/throwaway_scripts/dataframe_space.py", line 9, in <module>
    df.Define('x', 'weight_var_th_nominal ')
cppyy.gbl.std.runtime_error: Template method resolution failed:
  ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void> ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void>::Define(basic_string_view<char,char_traits<char> > name, basic_string_view<char,char_traits<char> > expression) =>
    runtime_error:
RDataFrame: An error occurred during just-in-time compilation. The lines above might indicate the cause of the crash
 All RDF objects that have not run an event loop yet should be considered in an invalid state.

  ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void> ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void>::Define(basic_string_view<char,char_traits<char> > name, basic_string_view<char,char_traits<char> > expression) =>
    runtime_error:
RDataFrame: An error occurred during just-in-time compilation. The lines above might indicate the cause of the crash
 All RDF objects that have not run an event loop yet should be considered in an invalid state.

Is there any way to access such branches provided I cannot change the way the initial files are produced?

Thanks in advance,
Aleksandr


ROOT Version: 6.26/04

Hi @apetukho ,

you can try with an Alias('x', 'weight_var_th_nominal ') but admittedly this is not a case we have encountered/tested before, so I am not 100% sure it will work.

Cheers,
Enrico

Unfortunately, it doesn’t.

If I try

import ROOT

df = ROOT.RDataFrame('tree_PFLOW', 'file.root')
df.Alias('x', 'weight_var_th_nominal ')
df.Define('y', 'x')

I get the following error:

input_line_75:2:28: error: use of undeclared identifier 'x'
auto lambda0 = [](){return x
                           ^
input_line_79:2:28: error: use of undeclared identifier 'x'
auto lambda0 = [](){return x
                           ^
Traceback (most recent call last):
  File "/eos/home-a/apetukho/IncZZ/throwaway_scripts/dataframe_space.py", line 5, in <module>
    df.Define('y', 'x')
cppyy.gbl.std.runtime_error: Template method resolution failed:
  ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void> ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void>::Define(basic_string_view<char,char_traits<char> > name, basic_string_view<char,char_traits<char> > expression) =>
    runtime_error:
RDataFrame: An error occurred during just-in-time compilation. The lines above might indicate the cause of the crash
 All RDF objects that have not run an event loop yet should be considered in an invalid state.

  ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void> ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void>::Define(basic_string_view<char,char_traits<char> > name, basic_string_view<char,char_traits<char> > expression) =>
    runtime_error:
RDataFrame: An error occurred during just-in-time compilation. The lines above might indicate the cause of the crash
 All RDF objects that have not run an event loop yet should be considered in an invalid state.


The “undeclared identifier ‘x’” error is because the 4th line should be df = df.Alias('x', 'weight_var_th_nominal '). Alias returns a new dataframe node rather than modifying df in place, so you have to assign the result.

Such a silly mistake.

Yes, the following code works perfectly fine, thank you for your help!

import ROOT

df = ROOT.RDataFrame('tree_PFLOW', 'file.root')
df = df.Alias('x', 'weight_var_th_nominal ')
df = df.Define('y', 'x')
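
If a file has many such branches, the same Alias trick can be applied in a loop. The following is only a minimal sketch, not something tested in this thread: the helper names safe_alias and alias_awkward_columns are made up for illustration, and it assumes that the whitespace-free alias does not clash with an existing column name. It relies only on GetColumnNames and Alias from the RDataFrame interface.

```python
import re


def safe_alias(name):
    """Hypothetical helper: build a whitespace-free alias for a column name.

    Leading/trailing whitespace is dropped and inner whitespace runs
    are replaced by a single underscore.
    """
    return re.sub(r'\s+', '_', name.strip())


def alias_awkward_columns(df):
    """Hypothetical helper: alias every column whose name contains whitespace.

    `df` is expected to be an RDataFrame (or any RDF node).
    Name collisions between an alias and an existing column are NOT handled.
    """
    for col in df.GetColumnNames():
        name = str(col)  # PyROOT yields std::string objects
        alias = safe_alias(name)
        if alias != name:
            # Alias returns a new node, so the result must be reassigned.
            df = df.Alias(alias, name)
    return df


# Intended usage (requires ROOT, so left as a comment here):
#   import ROOT
#   df = ROOT.RDataFrame('tree_PFLOW', 'file.root')
#   df = alias_awkward_columns(df)
#   df = df.Define('y', 'weight_var_th_nominal')
```

Note that this thread only confirms that an unrelated alias such as 'x' works. Reusing the original name without the trailing space as the alias is an assumption and may behave differently depending on the ROOT version.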
