TTree::Draw has an easy-to-use syntax, but it loops over all entries for each plot. For large datasets and a large number of plots, this is inefficient because of I/O time, which generally dominates CPU time. One can write a macro instead, but it lacks the concise syntax of TTree::Draw and is harder to maintain or to port from one analysis to another.

I wonder how TTree/TChain implements the parsing of a selection string and the testing of a selection on an event. I would think it should be easy to offer similar functionality for filling histograms. I am thinking about pseudocode like the following:

tree->RunDraw() //where it fills all histograms

or (if an explicit loop is needed)

for (...[getting entries]..){ 

or even just the following, so that there is no need to write complicated nested if structures:
if (event->cut(selection1)) histo1->Fill(var1);

In short, is there currently an easy way to fill histograms that has the following features:

  • simple selection string syntax
  • scales to a large number of histograms
  • (possibly) PyROOT support

There were similar questions before:

TDataFrame seems to be working in that direction, but it baffles me that it doesn’t do string parsing the way TTree::Draw does. How hard is it to reuse the selection string feature?

RDataFrame supports transformations (e.g. Filter and Define) expressed as strings. These strings cannot be the same as those used by TTree::Draw; they must be valid C++ expressions. You can see several examples here.


How hard is it to reuse the selection string feature?

It seemed to be quite hard the one time I looked into it: the string parsing is entangled with TTree-specific logic such as TBranch access. I would love to be proven wrong, though :)

