RDataFrame and Friend Trees

Dear experts,

I have an RDataFrame question, regarding the creation of new variables and/or use of Friend Trees.

The specific use-case I have in mind is as follows:

I have an input Tree with many variables, and I add a new variable to it using an alias. This variable is computationally expensive (i.e. slow to evaluate), so I would like to reuse it for a number of future iterations without recalculating it. However, I do not want to add it to the primary input Tree, since its definition is still in flux.

It is of course possible to take a snapshot and create a new Tree with N+1 variables (and switch to this for the next batch of quick iterations), but this would double the disk space used.

Instead, I am trying to find a way to store this new variable as a minimal Friend Tree (following suggestions on the RDataFrame main page), and add it to the main Tree so one can perform quick iterations with this new variable until one decides to recompute it.

The issue is that, as far as I understand, the order of the events in the input Tree and the Friend Tree won’t match when multi-threading (MT) is enabled, and MT is necessary for performance reasons.
Is there a recommended way to proceed in this sort of scenario?
Any other non-friend-tree suggestions are also welcome of course.

Thanks for your time!

Hi Texaner,
I believe you listed all available options together with their drawbacks:

  • write out the friend from a single-thread run, at the cost of runtime
  • write out a new tree with the extra column, at the cost of disk space (and possibly some runtime)
  • recalculate the quantity every time you process the tree, at the cost of runtime

There is a fourth option that is currently unavailable: writing out a friend TTree with a TTreeIndex that associates its entries with the entries of the original tree. Unfortunately, RDataFrame does not support indexed friends at the moment (and in any case they involve random access into the tree’s entries, which is not very performant).
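
To make the difference between a plain (positional) friend and an indexed friend concrete, here is a toy sketch in plain Python. It does not use ROOT at all: lists stand in for trees, `expensive` is a hypothetical placeholder for the costly variable, and the entry ranges and their completion order are made-up values chosen only to mimic a multi-thread run writing its output out of order.

```python
# Toy model only: plain Python lists stand in for trees; no ROOT calls are used.

def expensive(x):
    """Hypothetical stand-in for the costly per-event computation."""
    return x * x + 1

# One value per entry of the "main tree", in entry order.
main_tree = [3, 1, 4, 1, 5, 9, 2, 6]

# Entry ranges that separate threads would process (assumed split).
ranges = [(0, 3), (3, 6), (6, 8)]

# Assume the ranges happen to finish in this order; it is fixed here
# so that the example is reproducible.
completion_order = [1, 2, 0]

# Rows of the "friend tree" are appended as each range finishes.
# Each row also stores the original entry number.
friend_rows = []
for r in completion_order:
    begin, end = ranges[r]
    for entry in range(begin, end):
        friend_rows.append((entry, expensive(main_tree[entry])))

# What the friend should contain if it were aligned with the main tree.
expected = [expensive(x) for x in main_tree]

# Positional pairing (entry i of the main tree with row i of the friend)
# picks up values that belong to other entries.
positional = [value for _, value in friend_rows]

# Index-based pairing: look up each main-tree entry by its stored entry number.
by_entry = {entry: value for entry, value in friend_rows}
indexed = [by_entry[i] for i in range(len(main_tree))]

print("expected  :", expected)
print("positional:", positional)
print("indexed   :", indexed)
```

In this toy model the positional pairing is misaligned, while the lookup by stored entry number recovers the right value for every entry. The price is a lookup per entry instead of a sequential read, which is the random-access cost mentioned above.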

Sorry I don’t have a better suggestion!