How to remove or exclude one branch in RDataFrame in Python when reading from a tuple file?

Patrick_Wu · December 21, 2021, 12:52pm


ROOT Version: 6.24/06
Platform: Not Provided
Compiler: Not Provided

My tuple file contains 100 branches in total (“branch0”, “branch1”, “branch2”, …), and I want to exclude “branch0” from the RDataFrame.

from ROOT import RDataFrame

df = RDataFrame("DecayTree", "tuple.root")
if df.HasColumn("branch0"):
  do_something_to_remove_branch0()

I have tried

df = RDataFrame("DecayTree", "tuple.root", {"branch1", "branch2"})

but df.HasColumn("branch0") still gives True.

Can anyone advise how I can do this?

eguiraud · December 21, 2021, 12:54pm

Hi @Patrick_Wu ,
RDataFrame won’t read or do anything with branch0 unless you ask it to, i.e. RDF ignores branch0 by default. That said, if you ask whether branch0 exists in the dataset, it will of course say “yes”.

What are you trying to do exactly? E.g. do you want to Snapshot all branches into a new file except that one?
Cheers,
Enrico

Patrick_Wu · December 21, 2021, 1:39pm

Hi Enrico,

Thanks a lot for the swift reply!

I want to redefine branch0. If I run

df.Define("branch0", some_expression)

it will tell me

branch "branch0" already present in TTree

eguiraud · December 21, 2021, 1:43pm

Alright, the best way to do that is to use Redefine: df.Redefine("branch0", some_expression).

Redefine is currently only available in ROOT master and in our nightly builds, but it will soon be released in v6.26 (next month most likely).
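
For illustration, here is a minimal sketch of how the Redefine call could be wrapped so that it also behaves sensibly on builds that lack it. The helper name redefine_if_present is made up for this example and is not part of ROOT; it only relies on HasColumn, Define and Redefine, and it assumes the expression is a string of C++ code as accepted by Define.

```python
def redefine_if_present(df, name, expression):
    """Return a node in which column `name` is computed from `expression`.

    Hypothetical helper, not part of ROOT. `df` is an RDataFrame (or any node
    obtained from one); `expression` is a C++ expression string, as for Define.
    """
    if not df.HasColumn(name):
        # The name is free, so a plain Define is enough.
        return df.Define(name, expression)
    if hasattr(df, "Redefine"):
        # The column already exists (e.g. as a TTree branch): override it.
        return df.Redefine(name, expression)
    # Builds without Redefine cannot override an existing column.
    raise RuntimeError(
        'column "%s" already exists and this ROOT build has no Redefine' % name
    )


# Possible usage (the expression is only a placeholder and assumes
# that branch1 and branch2 are numeric):
#   import ROOT
#   df = ROOT.RDataFrame("DecayTree", "tuple.root")
#   df = redefine_if_present(df, "branch0", "branch1 + branch2")
```

Like Define, Redefine returns a new node rather than modifying df in place, so the result has to be assigned and used for the rest of the computation.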

With older ROOT versions there is really no good workaround I’m afraid. You are forced to use a different name for the Define’d branch or to create a new dataset that’s a copy of your original dataset that does not contain “branch0”.
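
The second workaround (a copy of the dataset without “branch0”) could look roughly like the sketch below. The helper names and the output file name are hypothetical. It assumes that GetColumnNames lists exactly the branches you want to keep, and that PyROOT converts a Python list of strings to the column list Snapshot expects; if that conversion fails on your version, fill a ROOT.std.vector('string') instead.

```python
def columns_except(all_columns, excluded):
    """Return the column names in `all_columns` that are not in `excluded`."""
    excluded = set(excluded)
    # str() turns std::string proxies into plain Python strings.
    return [str(c) for c in all_columns if str(c) not in excluded]


def snapshot_without(df, tree_name, out_file, excluded):
    """Write every column of `df` except `excluded` to a new file.

    Hypothetical helper, not part of ROOT.
    """
    keep = columns_except(df.GetColumnNames(), excluded)
    # Snapshot runs the event loop and returns a data frame on the new file.
    return df.Snapshot(tree_name, out_file, keep)


# Possible usage (file names are placeholders):
#   import ROOT
#   df = ROOT.RDataFrame("DecayTree", "tuple.root")
#   snapshot_without(df, "DecayTree", "tuple_no_branch0.root", ["branch0"])
#   df2 = ROOT.RDataFrame("DecayTree", "tuple_no_branch0.root")
#   df2 = df2.Define("branch0", "branch1 + branch2")  # name is free now
```

The price of this approach is one extra pass over the data and a second file on disk, which is why using a different name for the new column is usually the cheaper option.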

Cheers,
Enrico

Patrick_Wu · December 21, 2021, 1:47pm

Understood. Thanks a lot for the explanation. I’m looking forward to the release of ROOT v6.26!
