RDataFrame: is there a way to know (initial) number of entries without running the loop?

Dear Experts,

Is there a way to get the (initial) number of entries in an RDataFrame without an explicit event loop?
I use RDataFrame to analyse a very large TChain.
Of course I can get the number of events from the TChain object,
but in my code I have a (Python) function that takes an RDataFrame object as input,
and it would be nice to get the number of entries from the frame itself.
Surely I can rely on frame.Count(), but retrieving its value triggers the loop.

In other words, is there a way to get the number of entries from an RDataFrame before running the actual loop?

def my_fun(frame):

    nentries = .... ?  ## is there a way to get the number of entries here?

    variables = ...
    return frame.Book(std.move(MyAction(...)), variables)

If getting the info from the TChain itself is not good enough for you, I guess @vpadulan can help. And I suppose you have already had a look at the ROOT::RDF::RInterfaceBase class reference?

Dear @bellenot
Thank you for the prompt reply.

Yes, you are right: currently I am getting the number of events from the initial TChain (which is outside the scope of my function), but I'd like to keep my code a bit more generic, without referring back to the initial TChain object.

I've inspected RInterfaceBase and RNodeBase and found no obvious candidates.

Hi Vanya,

Unfortunately, we do not currently plan to add this feature to RDF, in order to keep the interface minimal and as generic as possible. I hope you can do something with chains. If not, would you perhaps like to share more about your use case, in case it cannot be catered for by RDF?

Cheers,
D
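
For readers landing here, the chain-based workaround mentioned in this thread could look roughly like the sketch below. It is not an RDataFrame feature: `CountedFrame` and `make_counted_frame` are hypothetical names invented for illustration, and the only ROOT calls assumed are `TChain.GetEntries()` and the `ROOT.RDataFrame(chain)` constructor. The idea is to record the entry count once, at the place where the chain is still available, and pass it along with the frame.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class CountedFrame:
    """Hypothetical helper: a frame bundled with the entry count of its source chain."""
    frame: Any      # e.g. a ROOT.RDataFrame built from the chain
    nentries: int   # taken from the chain, not from the frame


def make_counted_frame(chain, frame_factory=None):
    """Build a frame from `chain` and store chain.GetEntries() next to it."""
    if frame_factory is None:
        # Imported lazily so that this helper can be imported without ROOT.
        import ROOT
        frame_factory = ROOT.RDataFrame
    # GetEntries() is asked of the chain itself, so no RDataFrame event loop is run.
    nentries = int(chain.GetEntries())
    return CountedFrame(frame=frame_factory(chain), nentries=nentries)
```

A function such as `my_fun` could then accept the `CountedFrame` and read `counted.nentries` before booking its action on `counted.frame`. Note that this is the initial number of entries in the chain; it does not account for any `Filter` applied to the frame later.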
