Options for TDataFrame.Report

While the Report() method of the new TDataFrame is quite nice for checking the efficiency of the different cuts, it could use some extra info and maybe some customization options.

For reference, this is what the output looks like on my ntuple:

mass      : pass=4775994    all=11518927   --   41.462 %
L0        : pass=3620861    all=4775994    --   75.814 %
HLT1      : pass=1724278    all=3620861    --   47.621 %
kin mu+   : pass=1577263    all=1724278    --   91.474 %
kin mu-   : pass=1444699    all=1577263    --   91.595 %
TrChi2 mu+: pass=1444025    all=1444699    --   99.953 %
TrChi2 mu-: pass=1443390    all=1444025    --   99.956 %
HLT2      : pass=1376755    all=1443390    --   95.383 %
signal    : pass=971088     all=1376755    --   70.535 %

It would be nice to have one last line summarizing the whole thing, to see the overall efficiency of the selection chain:

selected.Report("summary")
SUMMARY   : pass=971088     all=11518927   --    8.430 %

Hi,
thanks for the suggestion. I think it is a good idea, but other things take priority, so it might take a while for this feature to land (pull requests are welcome :slight_smile: ).

In the meantime, it’s not much effort to build the summary yourself. Something like this should do the trick:

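// "tree" is a placeholder for the name of the TTree stored in data.root
TDataFrame d("tree", "data.root", {"branch1", "branch2"});
auto all = d.Count();
// keep a handle on the filtered node: Report() belongs to the node, not to the Count() result
auto selected = d.Filter(filter1, "f1").Filter(filter2, "f2");
auto passed = selected.Count();
selected.Report();
// overall efficiency of the whole chain, as a fraction
std::cout << "SUMMARY: " << *passed * 1. / *all << std::endl;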
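
If you want the summary line in the same layout as the Report() output, a small helper along these lines might do. This is only a sketch: FormatSummary is a made-up name, not part of ROOT, and the column widths are guessed from the output quoted in the first post.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of ROOT): formats a summary line that mimics
// the layout of the Report() lines quoted in the first post.
std::string FormatSummary(unsigned long long passed, unsigned long long all)
{
   // Guard against an empty input sample to avoid dividing by zero.
   const double eff = all > 0 ? 100. * static_cast<double>(passed) / static_cast<double>(all) : 0.;
   char buf[128];
   // Column widths are an assumption based on the quoted output.
   std::snprintf(buf, sizeof(buf), "%-10s: pass=%-10llu all=%-10llu -- %8.3f %%",
                 "SUMMARY", passed, all, eff);
   return std::string(buf);
}
```

With the counts from the snippet above you would call it as `std::cout << FormatSummary(*passed, *all) << std::endl;`. For the numbers in the first post (971088 out of 11518927) this gives 8.430 %.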

OK, fair enough. That works as a workaround for now. :slight_smile:
