Saving the print of a ROOT::RDF::RCutFlowReport in Python


ROOT Version: 6.30/04
Platform: AlmaLinux9
Compiler: gcc13


Greetings,

I would like to save the output of an RDF.Report().Print() call to a text file from within a Python program, but I’m running into some difficulties. For example, I have the following code and would like to capture its output (in Python, so no shell piping etc.):

import ROOT
rdf = ROOT.RDataFrame("Events", "root://eospublic.cern.ch//eos/opendata/cms/Run2016H/SingleMuon/NANOAOD/UL2016_MiniAODv2_NanoAODv9-v1/120000/61FC1E38-F75C-6B44-AD19-A9894155874E.root")
rdf = rdf.Filter("nMuon == 2", "2 muons")
rdf.Report().Print()

Since neither rdf.Report() nor rdf.Report().Print() returns a string, I cannot use

with open('output.txt', 'w') as file:
    file.writelines(rdf.Report().Print()) # or something similar

and for some reason redirecting stdout to a file doesn’t work either, i.e.

import sys
original_out = sys.stdout
with open("output.txt", "w") as file:
    sys.stdout = file
    print("hello") # this will be in the file
    rdf.Report().Print() # output of this will be in terminal

sys.stdout = original_out

Let me know if you have any suggestions, thank you.

Hi @toicca,
Thanks for your question.

I just tried this on my local machine, and I couldn’t reproduce the issue. I get:

>>> rdf.Report().Print()
2 muons   : pass=2713       all=14113      -- eff=19.22 % cumulative eff=19.22 %

Do you get any errors or warnings when you run:

rdf = ROOT.RDataFrame("Events", "root://eospublic.cern.ch//eos/opendata/cms/Run2016H/SingleMuon/NANOAOD/UL2016_MiniAODv2_NanoAODv9-v1/120000/61FC1E38-F75C-6B44-AD19-A9894155874E.root")

Cheers,
Dev

Hi Dev,

Thanks for the reply. The issue isn’t with the output of the .Print(), but with how to save it either as a string or directly to a text file. I get the same output as you, but would like to write it to a file without using >> or > when running the program from the terminal. How would you save the output of the method?

Best wishes
Nico

I see what you mean. I had a look at the implementation of Print() (ROOT: tree/dataframe/src/RCutFlowReport.cxx Source File); unfortunately, there is no method that returns a string. You could try to replicate the same implementation in Python with something similar to the following code:

import ROOT

rdf = ROOT.RDataFrame("Events", "root://eospublic.cern.ch//eos/opendata/cms/Run2016H/SingleMuon/NANOAOD/UL2016_MiniAODv2_NanoAODv9-v1/120000/61FC1E38-F75C-6B44-AD19-A9894155874E.root")
rdf = rdf.Filter("nMuon == 2", "2 muons")

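# Python re-implementation of the C++ RCutFlowReport::Print method
def cut_info_string(report):
    begin = report.begin()
    end = report.end()

    # The first cut sees every entry, so its "all" count is the overall total
    all_entries = 0 if begin == end else begin.__deref__().GetAll()
    lines = []
    it = begin
    while it != end:
        ci = it.__deref__()
        name = str(ci.GetName())  # explicit str so the format spec below applies to a Python string
        n_pass = ci.GetPass()
        n_all = ci.GetAll()  # not called "all", to avoid shadowing the Python builtin
        eff = ci.GetEff()
        cumulative_eff = 100.0 * float(n_pass) / float(all_entries) if all_entries > 0 else 0.0

        lines.append(f"{name:10}: pass={n_pass:<10} all={n_all:<10} -- eff={eff:.2f} % cumulative eff={cumulative_eff:.2f} %")

        it.__preinc__()
    # One line per cut, as Print() does
    return "\n".join(lines)

report_text = cut_info_string(rdf.Report())
print(report_text)

# Save the same text to a file
with open("output.txt", "w") as file:
    file.write(report_text + "\n")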
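
A possible alternative, as an untested-with-ROOT sketch: replacing sys.stdout only affects text written through Python, while Print() presumably writes through the C-level standard output of the process, which would explain why the redirect above had no effect. Redirecting file descriptor 1 itself with os.dup2 should catch that output too. The helper name redirect_c_stdout and the temporary output path are made up for illustration, the os.write call only stands in for rdf.Report().Print(), and ctypes.CDLL(None) assumes a POSIX system such as Linux.

```python
import ctypes
import os
import sys
import tempfile
from contextlib import contextmanager


@contextmanager
def redirect_c_stdout(path):
    """Temporarily point file descriptor 1 (process-level stdout) at `path`."""
    libc = ctypes.CDLL(None)  # C library of the running process (POSIX assumption)
    # Flush pending output so it still goes to the original destination
    sys.stdout.flush()
    libc.fflush(None)
    saved_fd = os.dup(1)  # keep a copy of the original stdout descriptor
    target_fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.dup2(target_fd, 1)  # from here on, fd 1 is the file
        yield
    finally:
        # Flush buffered output into the file, then restore the original stdout
        sys.stdout.flush()
        libc.fflush(None)
        os.dup2(saved_fd, 1)
        os.close(saved_fd)
        os.close(target_fd)


# Any writable path works; a temporary directory is used here only for the demo
out_path = os.path.join(tempfile.mkdtemp(), "output.txt")

with redirect_c_stdout(out_path):
    # Stand-in for rdf.Report().Print(): a write that bypasses sys.stdout
    os.write(1, b"written at file-descriptor level\n")

with open(out_path) as f:
    captured = f.read()
```

With ROOT loaded, the body of the with block would be the rdf.Report().Print() call, and captured would then hold the report text as a normal Python string.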

Cheers,
Dev

Thank you! I’ll use this for now. I think it would be nice to be able to get the printout as a string in some way, so I’ve created a GitHub issue for a possible follow-up.

Hi,

I agree with moving the discussion to GitHub. I replied to the issue there.

Cheers,
D