I’ve found some unexpected behavior in RDataFrame in PyROOT when upgrading from ROOT v6.14.08 to v6.20.04. I’ve been using RDataFrame to process NanoAOD format data and simulation.
As a simple example, the following Python code produces a segmentation fault in v6.20.04 but not in v6.14.08:
import ROOT
r = ROOT.RDataFrame("Events","file.root")
r2 = r.Filter("boolBranch")
r2.Snapshot('test','test.root','intBranch')
I’ve narrowed the problem down to branches storing bools: if I swap in branches of other types (ints, vectors, etc.) in the Filter, there is no issue. What’s perhaps most strange is that the following works fine in v6.20.04:
r = ROOT.RDataFrame("Events","file.root")
r2 = r.Filter("boolBranch")
r2.Snapshot('test','test.root','')
The only difference is that I haven’t specified a branch to snapshot. If I swap boolBranch in the first code block, it will seg fault.
Are there any changes between v6.14 and v6.20 that could cause such behavior? I’m happy to send an example file privately if that would be useful but from what I’ve tested so far, I have no reason to believe the same issue won’t occur for any NanoAODv5 or v6 sample.
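For anyone who wants to try this without a NanoAOD file, a synthetic input could be built roughly as follows. This is only a sketch: the file names and the two helper functions are made up for illustration, the branch contents are arbitrary, and it has not been confirmed that a file generated this way triggers the same crash as a real NanoAOD sample.

```python
# Hypothetical names, chosen only for this sketch.
TREE_NAME = "Events"
INPUT_FILE = "synthetic_input.root"
OUTPUT_FILE = "synthetic_output.root"


def make_test_file(n_entries=100):
    """Write a small tree with one bool branch and one int branch."""
    import ROOT  # imported here so the module loads even without ROOT

    df = ROOT.RDataFrame(n_entries)
    # rdfentry_ is the entry number provided by RDataFrame.
    df = df.Define("boolBranch", "rdfentry_ % 2 == 0")
    df = df.Define("intBranch", "static_cast<int>(rdfentry_)")
    df.Snapshot(TREE_NAME, INPUT_FILE)


def reproduce():
    """Same sequence as in the report: filter on a bool, snapshot an int."""
    import ROOT

    df = ROOT.RDataFrame(TREE_NAME, INPUT_FILE)
    filtered = df.Filter("boolBranch")
    # The third argument is a column-name regular expression.
    filtered.Snapshot("test", OUTPUT_FILE, "intBranch")


if __name__ == "__main__":
    make_test_file()
    reproduce()
```

Running the script once under each ROOT version would show whether the behavior depends on the NanoAOD file itself or only on the branch type.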
ROOT Version: v6.20.04 and v6.14.08
Platform: Ubuntu 18.04 LTS
Compiler: Not Provided