Opening multi-dim arrays in RDataFrame as flattened 1-dim array using pyROOT

Hi ROOT forum,
I am updating my code to use RDataFrame for the first time and am struggling to access a branch in my tree that contains multi-dimensional arrays. I can access it the “standard” way in an event loop. I am aware of other posts and the open ticket (ROOT-9509), which point out that multi-dimensional branches are currently not supported by RDataFrame and need to be flattened to 1-dim branches. This workaround is fine for me, but I couldn’t work out how to do it from the answers in those posts, nor could I find an example of how to implement it for RDataFrame (rather than for a TTreeReader) and in Python (rather than C++). I’m very sorry if this should be straightforward to see; my coding knowledge is limited. Here’s a minimal example that shows how I access the branch first without and then with RDataFrame:

#!/usr/bin/python3

import ROOT 

def normal_attempt(infile):
    print('\nFirst 3 events in event-loop approach:\n')
    data = ROOT.TFile.Open(infile,'read')
    tree = data.Get('default')
    for evt in range(0,3): 
        tree.GetEntry(evt)
        print(f'evt: {evt}, acl: {[tree.accum_level[s] for s in range(21)]}')

def rdf_attempt(infile):
    print('\nAttempt to access branches via rdf:\n')
    d = ROOT.RDataFrame('default',infile)
    entries1 = d.Filter('true_reaccode == 1').Count()
    print(f'{entries1.GetValue()} signal events') # for reference, this still works
    entries2 = d.Filter('accum_level>1').Count()
    print(f'{entries2.GetValue()} entries passed first 2 cuts')

if __name__ == '__main__':
    infile='test.root'
    normal_attempt(infile)
    rdf_attempt(infile)

I get the branch entries printed using the original attempt, and get the same error as seen in other posts when using the RDataFrame attempt:

First 3 events in event-loop approach:

evt: 0, acl: [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 2, 2, 2, 3, 3, 3]
evt: 1, acl: [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 2, 2, 2, 3, 3, 3]
evt: 2, acl: [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 2, 2, 2, 3, 3, 3]

Attempt to access branches via rdf:

26233 signal events
Traceback (most recent call last):
  File "/Users/Katharina/src/t2k/ultimate_plotting/minimal_rdf_issue.py", line 26, in <module>
    rdf_attempt(infile)
  File "/Users/Katharina/src/t2k/ultimate_plotting/minimal_rdf_issue.py", line 20, in rdf_attempt
    entries2 = d.Filter('accum_level>1').Count()
cppyy.gbl.std.runtime_error: Template method resolution failed:
  ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter,void> ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager,void>::Filter(basic_string_view<char,char_traits<char> > expression, basic_string_view<char,char_traits<char> > name = "") =>
    runtime_error: TTree leaf accum_level.accum_level has both a leaf count and a static length. This is not supported.

I would be very grateful if you could explain how to open my accum_level branch as a flattened 1-dim array when using RDataFrame with PyROOT. Thanks a lot in advance!

P.S.: sorry for not providing links to the existing related posts, the ticket, and my input file - the ROOT forum does not allow new users to add links to posts. Maybe I can try to provide them in a reply to this post(?).

ROOT Version: 6.26/06
Platform: MacOS 11.7.10, intel silicon

existing post: Flatten multi-dimensional array in rDataframe - #9 by SiewYan
the ticket: https://its.cern.ch/jira/browse/ROOT-9509
my example input file: CERNBox

Hello,

Thanks for the interesting post and welcome to the ROOT community!

Apologies for being perhaps thick here, but could the post you link from 2022 unblock you?

Best,
D

Thank you for your fast reply! Here’s what I tried so far, following what the poster from 2022 did:

d = d.Define('accum_level_flat', 'accum_level[21]')
d = d.Define('accum_level_flat', 'accum_level[0][21]')
d = d.Define('accum_level_flat', 'accum_level[accum_level+21]')
d = d.Define('accum_level_flat', 'accum_level[][]')
d = d.Define('accum_level_flat', 'accum_level[]')

All of those versions still got the same error with unsupported leaf count and static length… It might really just be me not being able to get one detail in the syntax right?
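
For reference, here is a small plain-Python sketch of what “flattened to 1-dim” would mean for the indexing. It does not touch RDataFrame at all and is not a fix for the error above. It assumes the branch is laid out as a variable number of rows with a static inner length of 21 (my reading of the "leaf count and a static length" message and of the range(21) loop). The names flatten and flat_index and the second row of the sample event are made up for illustration.

```python
# Sketch only: row-major flattening of a [n_rows][21] array into one list.
# ASSUMPTION: inner length 21 is taken from range(21) in the event loop above.
INNER_LEN = 21


def flatten(rows, inner_len=INNER_LEN):
    """Concatenate equal-length rows into a single 1-dim list (row-major)."""
    flat = []
    for row in rows:
        if len(row) != inner_len:
            raise ValueError(f"expected {inner_len} values, got {len(row)}")
        flat.extend(row)
    return flat


def flat_index(outer, inner, inner_len=INNER_LEN):
    """Position of element [outer][inner] inside the flattened list."""
    return outer * inner_len + inner


# Hypothetical event: the first row reproduces the values printed for evt 0,
# the second row is invented so that the two rows can be told apart.
event = [
    [2] * 10 + [3] * 5 + [2] * 3 + [3] * 3,
    [1] * INNER_LEN,
]
flat = flatten(event)

# Element [1][4] of the 2-dim view sits at index 1 * 21 + 4 in the flat view.
assert flat[flat_index(1, 4)] == event[1][4]
```

With this layout, a cut such as accum_level > 1 on one particular row would become a cut on the slice flat[outer * 21 : (outer + 1) * 21] of the flattened branch.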