RDataFrame Snapshot columnList



ROOT Version: 6.18/04
Platform: Centos7
Compiler: gcc 8.3.0


I start by creating an RDataFrame from a list of files, before applying various operations:

ROOT::RDataFrame d(tree_name, list_of_files);
auto new_d = d.Filter(some_filter).Define("new_var",... ;

At this stage, I wish to save the new dataframe to a ROOT file and do so like this:

new_d.Snapshot(tree_name, new_file);

To save only specific columns, I would do:

std::vector<std::string> save_cols = {"old_var1", "old_var2", "new_var"};
new_d.Snapshot(tree_name, new_file, save_cols);

or, using a regular expression:

new_d.Snapshot(tree_name, new_file, ".*_var.*");

My question is then: how can I combine the two methods? Simply writing the list of columns save_cols as a mixture of branch names and regular expressions doesn’t work, and taking multiple snapshots (e.g. one for the list of plain branch names plus one for each regexp) would introduce the complication of having to merge the output trees.

Any suggestions?

Hi @Demosthene,
the answer is more regular expressions!

I suggest you test and double-check, but I think something like (old_var1|old_var2|new_var|.*_var.*) should do the trick, or maybe, to be extra pedantic, (^old_var1$|^old_var2$|^new_var$|^.*_var.*$).
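
To make the test-and-double-check step concrete, here is a minimal sketch that builds the alternation from a mixed list and checks which column names it selects, without needing ROOT. The helpers combine_patterns and select_columns are hypothetical names, not part of ROOT, and std::regex is only a stand-in for whatever matcher Snapshot uses internally, so the regex dialect may differ in corner cases. Exact column names are inserted verbatim, which assumes they contain no regex metacharacters.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not a ROOT API): join exact column names and regular
// expressions into one alternation, e.g. {"a", "b", ".*_var.*"} becomes
// "(a|b|.*_var.*)". Names are inserted verbatim, so they must not contain
// regex metacharacters; plain identifiers such as old_var1 are fine.
std::string combine_patterns(const std::vector<std::string> &patterns)
{
   std::string combined = "(";
   for (std::size_t i = 0; i < patterns.size(); ++i) {
      if (i > 0)
         combined += "|";
      combined += patterns[i];
   }
   combined += ")";
   return combined;
}

// Hypothetical helper: return the columns that the pattern matches in full,
// in their original order. std::regex_match requires the whole name to match,
// which corresponds to the explicitly anchored (^...$) variant.
std::vector<std::string> select_columns(const std::vector<std::string> &columns,
                                        const std::string &pattern)
{
   const std::regex re(pattern);
   std::vector<std::string> selected;
   for (const auto &column : columns) {
      if (std::regex_match(column, re))
         selected.push_back(column);
   }
   return selected;
}
```

If the selection looks right, the combined string can be passed as the third argument of Snapshot, e.g. new_d.Snapshot(tree_name, new_file, combine_patterns({"old_var1", "old_var2", ".*_var.*"})). The list of available columns to test against could come from new_d.GetColumnNames().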

Cheers,
Enrico
