RDataFrame explicitly see how often event loop was executed

Dear ROOT developers,

is there an explicit way with which I can see how often an event loop was executed? This would be super useful for me to see for debugging purposes. Maybe there is a way to output a string every time the event loop is executed?

Using RDataFrame with ROOT 6.14 and pyROOT.

Cheers,
Ben

Hi Ben,

perhaps the simplest would be to do something like this:

ROOT::RDataFrame df(myTreename, myFileName);
// The Filter callable must return a bool; returning true keeps every event.
auto d = df.Filter([](ULong64_t e){ if (0ULL == e) std::cout << "Running evtloop" << std::endl; return true; }, {"rdfentry_"});
// Now start to work on 'd'. At every loop, at the first event, you'll have a message.

We plan to add a “debug mode” to rdf and we’ll keep this topic in mind during the design: thanks for the post!

Cheers,
D

Hi D,
thank you for this great idea! I tried to write a small minimal working example, but unfortunately it does not work. The error message is

root [0] 
Processing test.C...
terminate called after throwing an instance of 'std::runtime_error'
  what():  Unknown column: rdfentry_

Could you kindly help me?

Cheers,
Ben
test.C (1.6 KB)

It might still be tdfentry_ (with a t) in v6.14 (v6.16 accepts both tdfentry_ and rdfentry_).

Hi thank you, that solved the issue in C++. However, I cannot get it to run in Python. I defined a function there:

showWhenRunningEventLoop = '''
  bool showWhenRunningEventLoop(ULong64_t e){
    if (0ULL == e) std::cout << "!!!!! Running evtloop" << std::endl;
    return true;
  }
'''
ROOT.gInterpreter.Declare(showWhenRunningEventLoop)

I am trying to call it via

dataFrame = dataFrame.Filter("showWhenRunningEventLoop(tdfentry_)")

Unfortunately, when trying to define a variable next, which works without the above filter, ROOT replies to me that

ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter,void> ROOT::RDF::RInterface<ROOT::Detail::RDF::RJittedFilter,void>::Define(experimental::basic_string_view<char,char_traits<char> > name, experimental::basic_string_view<char,char_traits<char> > expression) =>
The main RDataFrame is not reachable: did it go out of scope? (C++ exception of type runtime_error)

Do you know a way to fix this?

I also tried calling

dataFrame = dataFrame.Filter("showWhenRunningEventLoop",("tdfentry_"))

but this did not work either…

Cheers,
Ben

Hi,

in Python that would look like this:

import ROOT 

d = ROOT.ROOT.RDataFrame(2)
c = d.Filter('if(rdfentry_ == 0) {cout << "Running evtloop" << endl; return true; } return false; ').Count()
print (c.GetValue())

(in root 6.14 rdfentry_ would be tdfentry_)

Cheers,
D

Hi thanks,

I am using the following code now:

ROOT.gInterpreter.Declare("int counterEventLoop = 0;")
eventLoopCounter = dataFrame.Filter('if(rdfentry_ == 0) {cout << "Running evtloop " << ++counterEventLoop << endl; return true; } return false; ').Count()

This now additionally tells me how often the event loop was run. Thank you for all of your help!

excellent!
Just remember not to go multithreaded with ROOT.ROOT.EnableImplicitMT() before making that counter atomic :)
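
To illustrate what "making that counter atomic" could look like, here is a minimal sketch in plain standard C++ that does not use ROOT at all. The name counterEventLoop is taken from the post above; showWhenRunningEventLoop mirrors the helper declared earlier in the thread, and simulateConcurrentLoops is a hypothetical stand-in that uses std::thread to mimic several threads hitting the counter at once. It is an assumption-laden illustration, not a description of how RDataFrame schedules work internally.

```cpp
#include <atomic>
#include <iostream>
#include <thread>
#include <vector>

// Counter that can be incremented safely from several threads.
std::atomic<int> counterEventLoop{0};

// Pass-through predicate in the style of the Filter above: it reports the
// first entry it sees and always returns true, so no events are discarded.
bool showWhenRunningEventLoop(unsigned long long entry)
{
   if (entry == 0ULL) {
      // fetch_add returns the previous value, so +1 is this loop's index.
      const int n = counterEventLoop.fetch_add(1) + 1;
      std::cout << "Running evtloop " << n << std::endl;
   }
   return true;
}

// Hypothetical stand-in for concurrent loops (plain std::thread, not ROOT):
// each thread walks entries 0..nEntries-1 and therefore hits entry 0 once.
int simulateConcurrentLoops(int nThreads, unsigned long long nEntries)
{
   std::vector<std::thread> workers;
   for (int t = 0; t < nThreads; ++t)
      workers.emplace_back([nEntries] {
         for (unsigned long long e = 0; e < nEntries; ++e)
            showWhenRunningEventLoop(e);
      });
   for (auto &w : workers)
      w.join();
   return counterEventLoop.load();
}
```

With a plain int the concurrent increments would be a data race; with std::atomic<int> every increment is counted. In PyROOT the same declaration could presumably be passed to ROOT.gInterpreter.Declare in place of "int counterEventLoop = 0;", together with an #include <atomic>.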

Cheers,
D

Thank you, that is helpful advice!

I have now written a minimal working example to test whether this works. Please find it attached. Essentially, I think it should execute the event loop three times, if I understand RDataFrame correctly. So, I expect to see

Running evtloop 1
15.0
Running evtloop 2
15.0
Running evtloop 3
15.0

However, the output for me is

Running evtloop 1
15.0
15.0
15.0

I am sorry for bothering you again. What am I doing wrong?

minimumRunningExample.py (2.0 KB)

Hi,

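the Filter which prints the event loop index to screen belongs to the branch of the computation graph that ends in eventLoopCounter. Later event loops only evaluate the nodes needed by results that are still pending, so that Filter is not called again once the count has been computed.
You should attach the subsequent Defines to the node returned by that Filter (store it in a variable before calling Count()) rather than to dataFrame, and let the Filter return true for every entry so that no events are dropped.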
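
The effect can be reproduced with a toy model in plain C++. This is only a sketch of the general idea of a lazy computation graph; Node, Action, RunLoop and CountProbeMessages are hypothetical names and are NOT ROOT's implementation or API. A result booked on a branch that bypasses the printing filter never calls it, while a result booked downstream of the filter does.

```cpp
#include <functional>
#include <iostream>
#include <memory>
#include <utility>
#include <vector>

// Toy model only: NOT ROOT's implementation or API.
struct Node {
   std::shared_ptr<Node> parent;                      // upstream node, null for the source
   std::function<bool(unsigned long long)> predicate; // empty means "accept everything"

   // An entry passes if every filter between this node and the source accepts it.
   bool Passes(unsigned long long entry) const
   {
      if (parent && !parent->Passes(entry)) return false;
      return !predicate || predicate(entry);
   }
};

struct Action {
   explicit Action(std::shared_ptr<Node> n) : node(std::move(n)) {}
   std::shared_ptr<Node> node; // node the action is booked on
   unsigned long long count = 0;
   bool done = false;
};

// One "event loop": only pending actions are evaluated, so only the filters
// upstream of those actions are called.
void RunLoop(std::vector<Action *> &actions, unsigned long long nEntries)
{
   for (unsigned long long e = 0; e < nEntries; ++e)
      for (Action *a : actions)
         if (!a->done && a->node->Passes(e)) ++a->count;
   for (Action *a : actions) a->done = true;
}

// Runs two loops and returns how many times the probe filter printed.
int CountProbeMessages(bool attachToProbe)
{
   int messages = 0;
   auto source = std::make_shared<Node>();
   auto probe = std::make_shared<Node>();
   probe->parent = source;
   probe->predicate = [&messages](unsigned long long e) {
      if (e == 0) std::cout << "Running evtloop " << ++messages << std::endl;
      return true; // pass-through: keep every entry
   };

   Action first(probe); // plays the role of eventLoopCounter
   std::vector<Action *> booked{&first};
   RunLoop(booked, 5);

   // A later result, attached either to the probe node or to the source.
   auto later = std::make_shared<Node>();
   later->parent = attachToProbe ? probe : source;
   Action second(later);
   booked.push_back(&second);
   RunLoop(booked, 5);
   return messages;
}
```

In this model CountProbeMessages(false) sees the message once although two loops ran, which matches the single "Running evtloop 1" reported above, while CountProbeMessages(true) sees it in both loops.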

Cheers,
D

Thank you for clarifying this! Silly me!

Cheers,
Ben
