Empty RNode after a return


ROOT Version: 6.36.04
Built for linuxarm64 on Aug 25 2025, 00:00:00
From tags/6-36-04@6-36-04


Hello Experts,
In the following snippet, the first print statement (rdf.Display()->Print()) inside the helper function __useEnergyFilterAndGetIndices() correctly prints out the desired DataFrame.
Similarly, the last node’s print statement (filtered_in.Display()->Print()) also prints out the desired DataFrame. The node is then returned.
However, outside the helper function, it seems that the RNode filtered_out is empty.
I could not figure out why this is the case.

namespace {
  ROOT::RDataFrame hadronicShowerProfiles("Had", "data/AllShowers.root");
  ROOT::RDataFrame emShowerProfiles("EM", "data/AllShowers.root");
}
nicemc::RNG myrng;

ROOT::RDF::RNode __useEnergyFilterAndGetIndices(ROOT::RDataFrame rdf, double target_energy){

  rdf.Display()->Print();
  auto bw_target_and = [&target_energy](float engy){return TMath::Abs(engy - target_energy);};
  auto  diff_rdf = rdf.Define("difference", bw_target_and, {"energy"});
  double min_diff_value = diff_rdf.Min("difference").GetValue();

  double TOLERANCE = 1e-20;
  auto floating_point_comparison = 
    [&TOLERANCE, &min_diff_value]
    (double each_diff_value)
    {return TMath::Abs(each_diff_value - min_diff_value) <= TOLERANCE;};

  auto filtered_in =  diff_rdf.Filter(floating_point_comparison, {"difference"});
  filtered_in.Display()->Print();
  return filtered_in;
}


void getShowerProfile(
  double targetShowerEnergy, nicemc::ShowerType type, bool force_first_in_bin = false
){

  ROOT::RDataFrame rdf(0);

  switch(type){
    case nicemc::ShowerType::Hadronic:        
      rdf = hadronicShowerProfiles;
      break;
    case nicemc::ShowerType::ElectroMagnetic: 
      rdf = emShowerProfiles;
      break;
  }

  ROOT::RDF::RNode filtered_out = __useEnergyFilterAndGetIndices(rdf, targetShowerEnergy);
  filtered_out.Display()->Print();

}

I’ll answer my own question here, in case I encounter this in the future :slight_smile:

Local variables min_diff_value and target_energy have to be captured by copy.

It appears that I can capture TOLERANCE by reference and the program will still proceed as expected, but I have no idea why.
I see no reason why the lifetime of TOLERANCE should be different from that of min_diff_value.

By capturing those variables by reference and then returning the RDF node, you are causing undefined behavior: RDataFrame is lazy, so the lambdas are executed again when the event loop is triggered after __useEnergyFilterAndGetIndices has already returned, and the references are already dangling at that point. So yes, capturing by copy is definitely the way to go, and the fact that TOLERANCE works is pure chance :slight_smile:
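
For future readers, here is a minimal sketch of the capture-by-copy fix that compiles without ROOT. LazyFilter and makeClosestFilter are hypothetical names made up for this illustration, not ROOT API; the struct only mimics how an RDataFrame node stores a lambda and runs it later, after the function that built it has returned. The tolerance value is also just an assumption for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a lazy node: it stores the predicate
// and only evaluates it when run() is called.
struct LazyFilter {
  std::vector<double> values;
  std::function<bool(double)> keep;

  std::vector<double> run() const {
    std::vector<double> out;
    for (double v : values)
      if (keep(v)) out.push_back(v);
    return out;
  }
};

// Builds a filter that keeps the value(s) closest to target.
LazyFilter makeClosestFilter(std::vector<double> values, double target) {
  assert(!values.empty());

  // Smallest absolute distance to the target, computed eagerly.
  double min_diff = std::abs(values.front() - target);
  for (double v : values)
    min_diff = std::min(min_diff, std::abs(v - target));

  const double tolerance = 1e-12;  // assumed value, for illustration only

  // Capture by copy: the closure owns its own target, min_diff and tolerance,
  // so it stays valid after this function returns. With [&] these would be
  // references to locals that no longer exist when run() is called.
  auto keep = [target, min_diff, tolerance](double v) {
    return std::abs(std::abs(v - target) - min_diff) <= tolerance;
  };

  return LazyFilter{std::move(values), keep};
}
```

In the original snippet, the equivalent change would be to write the capture lists as [target_energy] and [TOLERANCE, min_diff_value], without the ampersands.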