How can I save values to a .txt file with RDataFrame?

Dear experts,

I am trying to save the values of basic variables, such as the event and run numbers, to a .txt file for events where the transverse mass is greater than 90. How can one do that?
I was thinking about a filter, but I do not know what to do after that. For example:

auto interesting_evts = df.Filter( [] (const float& mt, const UInt_t& run, const ULong64_t& event){
    if ( mt > 90 ) { 
      DO SOMETHING TO SAVE EVENT AND RUN NUMBERS (but how?)
    }
}, {"mt", "run", "event"});

I am not sure whether "auto" is the right approach here, because it has to be called somewhere so that it gets “activated”, and I do not call it anywhere…

What I want to have as an outcome is a txt file like:

MT EVENT RUN
95 2133 12948
… … …

Thank you in advance.

Best,
Newbie

You can use an open std::ofstream to write to a txt file within the lambda.

Maybe also:
https://root.cern/doc/master/classROOT_1_1RDF_1_1RInterface.html#a7fb6ccf99d69bee7c519dcfe25bfa16b
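
A minimal sketch of the std::ofstream idea, assuming the column names and types from the question. To keep it compilable without ROOT, the Event struct and the plain loop are hypothetical stand-ins for the dataset and the RDataFrame event loop, and unsigned int / unsigned long long stand in for UInt_t / ULong64_t. The point is only that the stream is opened once, outside the lambda, and captured by reference:

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one entry of the dataset (columns "mt", "event", "run").
struct Event {
  float mt;
  unsigned long long event; // ULong64_t in ROOT
  unsigned int run;         // UInt_t in ROOT
};

// Writes one "MT EVENT RUN" row per event with mt > 90.
// Returns the number of rows written (not counting the header).
int writeInterestingEvents(const std::vector<Event>& events, const std::string& path) {
  std::ofstream out(path); // opened once, before the event loop
  out << "MT EVENT RUN\n";
  int nWritten = 0;

  // The lambda captures the open stream by reference. With RDataFrame, a lambda
  // of this shape could be passed to Foreach with the columns {"mt", "event", "run"}.
  auto writeRow = [&out, &nWritten](float mt, unsigned long long event, unsigned int run) {
    if (mt > 90) {
      out << mt << ' ' << event << ' ' << run << '\n';
      ++nWritten;
    }
  };

  // Plain loop standing in for the RDataFrame event loop.
  for (const auto& e : events) {
    writeRow(e.mt, e.event, e.run);
  }
  return nWritten;
}
```

Sharing one stream like this is only safe in a single-threaded event loop; if implicit multi-threading is enabled, the lambda would be called concurrently and the writes would need to be protected.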

I can use an open std::ofstream, but since my “interesting_evts” is not called anywhere, it does nothing. For example, if I do:

auto interesting_evts = df.Filter( [] (const float& mt, const UInt_t& run, const ULong64_t& event){
    if ( mt > 90 ) { 
      std::cout << " run " << run << " event " << event << std::endl;
    }
}, {"mt", "run", "event"});

it does not print anything, because the function is never called. How should one call it so that it prints something?

Can you post the result of:

std::cout << interesting_evts.Count() << " entries passed all filters" << std::endl;

So now I have

auto interesting_evts = df.Filter( [] (const float& mt, const UInt_t& run, const ULong64_t& event){
    if ( mt > 90 ) { 
      std::cout << " run " << run << " event " << event << std::endl;
    }
}, {"mt", "run", "event"});

std::cout << interesting_evts.Count() << " entries passed all filters" << std::endl; 

and I get a compilation error: "static assertion failed: filter expression returns a type that is not convertible to bool"
static_assert(std::is_convertible<FilterRet_t, bool>::value,

The lambda you pass to Filter must return a boolean; only entries for which it returns true are kept.
if(mt>90) { .... return true;} else return false;

See here how to properly define filters.
https://root.cern.ch/doc/master/df001__introduction_8C.html
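
The static assertion from the error message can be reproduced in isolation. The sketch below is not ROOT's implementation; ReturnsBoolLike, passesMtCut and printsOnly are hypothetical names, and the check simply asks whether the callable's return type is convertible to bool, which is the condition the error message complains about:

```cpp
#include <type_traits>
#include <utility>

// A selection that returns bool, as a filter expression must.
auto passesMtCut = [](float mt) { return mt > 90; };

// A callable that returns void, like the lambda that failed to compile.
auto printsOnly = [](float) {};

// Hypothetical helper: true if calling F with a float yields something convertible to bool.
template <typename F>
struct ReturnsBoolLike
    : std::is_convertible<decltype(std::declval<F>()(0.f)), bool> {};

static_assert(ReturnsBoolLike<decltype(passesMtCut)>::value,
              "a bool-returning lambda is usable as a filter");
static_assert(!ReturnsBoolLike<decltype(printsOnly)>::value,
              "void is not convertible to bool, hence the compilation error");
```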

Well, when I define it as follows:

auto interesting_evts = df.Filter( [] (const float& mt, const UInt_t& run, const ULong64_t& event){
    if ( mt > 90 ) { 
      return true;
    }
    return false;
}, {"mt", "run", "event"});

std::cout << interesting_evts.Count() << " entries passed all filters" << std::endl; 

it still gives me a compilation error: no match for "operator<<" (operand types are "std::ostream" {aka "std::basic_ostream<char>"} and "ROOT::RDF::RResultPtr<long long unsigned int>"),
with the note: no known conversion for argument 2 from "ROOT::RDF::RResultPtr<long long unsigned int>" to "int"

Count() returns an RResultPtr, which has to be dereferenced to get the value. Use instead:

(*(interesting_evts.Count()))

It worked, and the output is "20 entries passed all filters".
That gave me some hope. How can I get the "event" and "run" information for those entries?

Reintroduce

std::cout << " run " << run << " event " << event << std::endl;

just before
return true;

You can also follow the example in the link above, namely by using Take() or Foreach().

Or you can use Snapshot() to save a TTree and then export it as CSV.

Thank you for your replies.

Foreach does exactly what I want:

df.Foreach([](const float& mt, const UInt_t& run, const ULong64_t& event){
    if ( mt > 90 ) {
      std::cout << " run " << run << " event " << event << std::endl;
    }
}, {"mt", "run", "event"});

This way I do not need to return a bool; it’s easy and short. Thanks again!

Hi,

I’m late to the party, but I’d like to suggest a slight rewrite of the basic approach you converged on, which is completely right:

  • it might be more readable to separate the logic into a Filter for selecting events and a Foreach for printing them
  • variables of fundamental types are usually passed by value in C++ (it’s faster and more readable)
  • std::endl is sometimes slower than a plain '\n', because it also flushes the stream

auto printEvent = [](UInt_t r, ULong64_t e) {
  std::cout << r << ',' << e << '\n';
}; 

std::cout << "run,event\n"; // print CSV file header
df.Filter("mt > 90")
  .Foreach(printEvent, {"run", "event"});

Cheers,
Enrico
