Swapping columns in RDataFrame

I’m trying to swap the names of several columns in an RDataFrame so that the data matches the format expected by another script, which defines the branch prefixes the other way around. My first attempt was:

# rdf contains Dp_l1_P and Dp_l2_P
new_rdf = rdf.Alias('Dp_l1_P_bak', 'Dp_l1_P') \
    .Alias('Dp_l1_P', 'Dp_l2_P') \
    .Alias('Dp_l2_P', 'Dp_l1_P_bak')

but this doesn’t work because the second Alias call tries to reuse the name Dp_l1_P, which already exists as a column. I guess this would need ROOT-10165.

Is there a way to do this? Or do I have to rename and write a temporary file without the original columns?

Hi Chris,
currently the only way to do it is with the extra step you describe: a temporary dataset that avoids the name clashes. Indeed, ROOT-10165 will fix this.

Instead of writing an intermediate file, you could also keep the intermediate dataset in RAM with Cache (the equivalent of Snapshot, but it keeps everything in memory), provided the data fits in RAM.
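
A minimal sketch of the Cache route, assuming PyROOT: copy each column to a temporary name with Define, Cache only the temporary columns so the original names no longer exist, then Alias the temporaries to the swapped names. The helpers swapped_name and swap_columns and the swp_ prefix are made up for illustration. Define, Cache and Alias are RDataFrame methods, but passing a plain Python list to Cache relies on PyROOT's automatic conversion and may need adjusting for your ROOT version; this has not been checked against any particular release.

```python
def swapped_name(name, a="Dp_l1_", b="Dp_l2_"):
    """Exchange prefixes a and b; names with neither prefix are unchanged."""
    if name.startswith(a):
        return b + name[len(a):]
    if name.startswith(b):
        return a + name[len(b):]
    return name


def swap_columns(rdf, columns, a="Dp_l1_", b="Dp_l2_", tmp_prefix="swp_"):
    """Return a dataframe in which the a/b prefixes of `columns` are swapped.

    Hypothetical helper: `rdf` is expected to be an RDataFrame node.
    Only the listed columns survive the Cache step, so pass every
    column that is needed downstream, not just the ones to swap.
    """
    staged = rdf
    tmp_names = []
    for col in columns:
        tmp = tmp_prefix + col
        # Copy the column under a temporary name (expression = column name).
        staged = staged.Define(tmp, col)
        tmp_names.append(tmp)

    # In-memory dataset holding only the temporary columns, so the
    # original names are free again. Everything must fit in RAM.
    cached = staged.Cache(tmp_names)

    out = cached
    for col, tmp in zip(columns, tmp_names):
        # Expose each temporary column under its swapped final name.
        out = out.Alias(swapped_name(col, a, b), tmp)
    return out
```

With the dataframe from the question this would be called as new_rdf = swap_columns(rdf, ['Dp_l1_P', 'Dp_l2_P']). The same staging should work with Snapshot instead of Cache if the data does not fit in memory.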

Cheers,
Enrico

Thanks Enrico. I’d missed Cache when looking at the docs. It might be useful to make Snapshot mention Cache as the in-memory equivalent.
