RDataFrame Snapshot on NANOAOD

Dear experts,

I am using ROOT 6.20/02 (6.24/06 behaves similarly) to test a simple script that creates a snapshot of a nanoAODv9 file, once keeping all branches and once keeping only a subset of them.

The input file is:

The snippet using the dataframe object df is:

branches = ROOT.std.vector("string")()
branches.push_back(str('Jet_pt')) # original branch name, as stored in NanoAOD
branches.push_back(str('JetPt')) # renamed branch, as defined below

df.Snapshot('Events', 'snapshot_test.root', branches)

This fails with the error:

Error in TBranch::TBranch: Illegal leaf: Jet_pt/Jet_pt[nJet]/F. If this is a variable size C array it’s possible that the branch holding the size is not available.
*** Break *** segmentation violation

However, removing the original branch ‘Jet_pt’ from ‘branches’ solves the problem.

Similarly, not specifying any branches in Snapshot also works (but this defeats the purpose).

I guess the issue is that the native nanoAOD branches are not vectors but variable-size C arrays, and Snapshot does not handle them properly, while creating an alias in RDF somehow avoids the problem (which is an ugly workaround).
Is there any way other than this workaround to slim a nanoAOD TTree?


I guess @eguiraud can help you.

I'm on my phone, but try always adding the array counter branch (nJet in this case) to the Snapshot branch list first, and see what happens. As for why things may work when an Alias is created, I could only speculate.

Hi @hsaka ,
indeed for now the solution is to add nJet to your branches list (before Jet_pt).

This is tracked as issue #6932 in root-project/root (“rdf.Snapshot columnList - automate size columns”), and we will make it so that manually adding size columns for variable-sized C-style arrays won’t be necessary.
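
Until then, the ordering can be handled by a small helper. What follows is only a sketch in plain Python, not part of ROOT: the function name `with_size_branches` and the `size_of` mapping are hypothetical, and the mapping from an array branch to its counter (here Jet_pt to nJet) has to be supplied by hand.

```python
def with_size_branches(branches, size_of):
    """Return a new list in which every array branch is preceded by its
    counter branch. `size_of` maps array branch name -> counter branch name
    (hypothetical, user-maintained mapping)."""
    ordered = []
    for name in branches:
        counter = size_of.get(name)
        # Put the counter first, unless it is already in the output list.
        if counter is not None and counter not in ordered:
            ordered.append(counter)
        # Skip names already emitted (e.g. a counter listed explicitly later).
        if name not in ordered:
            ordered.append(name)
    return ordered


# Assumption: Jet_pt is a variable-size array whose length is stored in nJet.
size_of = {"Jet_pt": "nJet"}
columns = with_size_branches(["Jet_pt", "JetPt"], size_of)
print(columns)  # ['nJet', 'Jet_pt', 'JetPt']
```

The resulting names can then be pushed into the `branches` vector in this order before calling `df.Snapshot('Events', 'snapshot_test.root', branches)`.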


Thanks, this does the trick!