Fill RVecs with Branch Values (as Array)

Hello,

Is there a way to fill RVecs with Branch Values. For example, I am trying to plot the histogram of the invariant mass. Below is my code showing my attempt to do so. (Note: my tree, called Delphes;4, has branches Muon_PT, Muon_Eta, Muon_Phi, and Muon_Charge. )

#include <vector>
#include <string>
#include "ROOT/RDataFrame.hxx"
#include "ROOT/RVec.hxx"
#include "ROOT/RDF/RInterface.hxx"
#include "TCanvas.h"
#include "TH1D.h"
#include "TLatex.h"
#include "TLegend.h"
#include "Math/Vector4Dfwd.h"
#include "TStyle.h"
using namespace ROOT::VecOps;
const auto z_mass = 91.2;

void selectMuon()
{
  ROOT::EnableImplicitMT();
  ROOT::RDataFrame df("Delphes;4", "tag_1_delphes_events.root");

TH1F *histDiMuonMass = new TH1F("mass", "M_{inv}(Z[3]Z[5]); M_inv (GeV/c^2); Events", 50, 0.0, 1500);
   RVec<float>  Muon_PT;
   RVec<float>  Muon_Eta;
   RVec<float>  Muon_Phi;
   RVec<int>  Muon_Charge;

  //auto idx_cmb = Combinations(Muon_PT, 2);

  for (size_t i = 0; i < Muon_Charge.size(); i++)
  {                
      if(Muon_Charge[1] != Muon_Charge[2])
      {
        ROOT::Math::PtEtaPhiMVector m1((Muon_PT)[1], (Muon_Eta)[1], (Muon_Phi)[1], 0.1);
        ROOT::Math::PtEtaPhiMVector m2((Muon_PT)[2], (Muon_Eta)[2], (Muon_Phi)[2], 0.1);
        
        auto mass = (m1 + m2).M();
        histDiMuonMass->Fill(mass);
        auto df_mass = df.Define("Dimuon_mass", InvariantMass<float>, {"Muon_PT", "Muon_Eta", "Muon_Phi", "m"});
   // Make histogram of dimuon mass spectrum
   auto h = df_mass.Histo1D({"Dimuon_mass", "Dimuon_mass", 30000, 0.25, 300}, "Dimuon_mass");   
}

} // end of event for loop
  histDiMuonMass->Draw();
  }

However, my histogram is empty and I figured that it is because my RVecs are empty. Is there a way to access Muon_PT, Muon_Eta, etc? I am trying to define that my RVecs have my branch values, so that I can call them later on when I define my vectors m1 and m2. Or since I would be treating Muon_PT, Muon_Charge, etcs as an array is there another way to define the Branch values instead of using RVecs? How can I do so?
Many thanks!

Hi,
if I read the code correctly, the problem is that you declare Muon_charge (at which point it is an empty RVec) and then you loop over it with for (size_t i = 0; i < Muon_Charge.size(); i++). Your code will never enter that loop.

Also, you are creating the histogram as h = df_mass.Histo1D(...) but then you are plotting a different histogram, histDiMuonMass, that you filled manually (but, again, the code never entered the event loop). Please make sure you understand what your code does, C++ can be tricky!

RDataFrame takes care of running the event loop for you. You only have to Define the columns you want, Filter the entries you want and get the results out. This is all the code you need, except for my comments in-code:

ROOT::RDataFrame df("Delphes;4", "tag_1_delphes_events.root");
// is `m` a valid column? and are the columns called Muon_PT or Muon.PT? You can check with `df.GetColumnNames()`, which returns a vector of strings of valid column names.
auto df_mass = df.Define("Dimuon_mass", InvariantMass<float>, {"Muon_PT", "Muon_Eta", "Muon_Phi", "m"});
auto h = df_mass.Histo1D({"Dimuon_mass", "Dimuon_mass", 30000, 0.25, 300}, "Dimuon_mass");
h->Draw();

Hope this helps!
Enrico

Thanks for the response. So, I see now how I can work out the invariant mass histogram.
I just also wanted to write it in a more general form as I found here in this tutorial https://root.cern.ch/doc/master/df103__NanoAODHiggsAnalysis_8C.html

That is why I left the for loop and attempted to define the RVecs. However, when I am writing this part here of the tutorial

// Find first lepton pair with invariant mass closest to Z mass
auto idx_cmb = Combinations(pt, 2);

auto best_mass = -1;
 
size_t best_i1 = 0; size_t best_i2 = 0;
 
for (size_t i = 0; i < idx_cmb[0].size(); i++) {
 
 const auto i1 = idx_cmb[0][i];
 
 const auto i2 = idx_cmb[1][i];
 
 if (charge[i1] != charge[i2]) {

   ROOT::Math::PtEtaPhiMVector p1(pt[i1], eta[i1], phi[i1], mass[i1]);
   ROOT::Math::PtEtaPhiMVector p2(pt[i2], eta[i2], phi[i2], mass[i2]);
 
const auto this_mass = (p1 + p2).M();

That is where I get the error that

Processing selectMuon.c…
Warning in TROOT::Append: Replacing existing TH1: mass (Potential memory leak).
terminate called after throwing an instance of ‘std::runtime_error’
what(): Cannot make unique combinations of size 2 from vector of size 0.

and I figure that is because it is not accessing my branch values and I am not sure how to define (in my RVecs). Do you know how to solve this?

Hi,
please check out this post about embedding code in your posts.

The problem is that the function reco_zz_to_4l, the body of which you copied, is not meant to be executed standalone. It needs to be passed to RDataFrame so RDataFrame will execute it during the event loop. Example:

int always_two() { return 2; }

int main() {
  ROOT::RDataFrame df("tree", "file.root");
  auto df2 = df.Define("two", always_two);
  return 0;
}

Here always_two is a standalone function, just like reco_zz_to_4l in the tutorial you linked. And just like in the tutorial, we pass the function to RDF’s Define, so RDataFrame knows that whenever it needs the column called "two" it can produce it by invoking that function.

In the tutorial, reco_zz_to_4l is passed to RDataFrame like this:

   auto df_z_idx = df_base.Define("Z_idx", reco_zz_to_4l, {"Muon_pt", "Muon_eta", "Muon_phi", "Muon_mass", "Muon_charge"});

where the third argument to Define is a list of column names that reco_zz_to_4l must be applied to in order to produce column Z_idx.

In summary: you must not work with RVec’s yourself, you have to define functions that work with RVec’s and tell RDataFrame what to use them for. The RDataFrame crash course might help further.

Cheers,
Enrico

I followed your correction. However, I still get the following error (along with a segmentation violation)

In file included from input_line_8:1:
/mnt/c/1/MG5_aMC_v2_6_6/triplet2s/Events/run_01/selectMuon1.c:25:1: error: function definition is not allowed here
{
^
/mnt/c/1/MG5_aMC_v2_6_6/triplet2s/Events/run_01/selectMuon1.c:61:1: error: function definition is not allowed here
{
^
/mnt/c/1/MG5_aMC_v2_6_6/triplet2s/Events/run_01/selectMuon1.c:78:1: error: function definition is not allowed here
{
^
/mnt/c/1/MG5_aMC_v2_6_6/triplet2s/Events/run_01/selectMuon1.c:89:39: error: use of undeclared identifier 'reco_zz_to_4l'
   auto df_z_idx = df.Define("Z_idx", reco_zz_to_4l, {"Muon_PT", "Muon_Eta", "Muon_Phi", "muon_m...
                                      ^
/mnt/c/1/MG5_aMC_v2_6_6/triplet2s/Events/run_01/selectMuon1.c:91:47: error: use of undeclared identifier 'compute_z_masses_4l'
   auto df_z_mass = df_z_idx.Define("Z_mass", compute_z_masses_4l, {"Z_idx", "Muon_PT", "Muon_Et...
                                              ^
/mnt/c/1/MG5_aMC_v2_6_6/triplet2s/Events/run_01/selectMuon1.c:93:48: error: use of undeclared identifier 'compute_higgs_mass_4l'
   auto df_h_mass = df_z_mass.Define("H_mass", compute_higgs_mass_4l, {"Z_idx", "Muon_PT", "Muon...
                                               ^

I still do not understand what the problem is in this case. I attached my code (which should be like the tutorial now). Do you know how I can remedy this issue. selectMuon1.c (4.2 KB)

The error messages are quite clear: you can’t define functions inside other functions in C++, they need to be at global scope (i.e. outside of the selectMuon1 function).
It might help to start from the actual code of the tutorial, make sure you can run that correctly, and modify it bit by bit (making sure, at every step, that it still compiles and runs correctly) to make it do what you need.

@eguiraud That helps. Thank you very much.
However, one other thing is that when I go on to draw my histogram I get an a segmentation violation error only (and no other error explaining what I am doing wrong). I attached my code and the segmentation violation. Do you know what is wrong and how I can fix it?
selectMuon1.c (6.7 KB)

error.txt (5.1 KB)

Thank you!

Hi,
a segfault in code like that probably means that you are doing an out-of-bound access, i.e. you are reading the n-th element of an array when n-1 elements are present or less.

You can investigate “manually” with printouts or you can try to attach a debugger like gdb to your program to identify where the crash happens exactly.

Cheers,
Enrico

I notice that the code runs well right until I write df_h_sig_4mu->Draw(); And I don’t understand why

That’s because RDataFrame is lazy: it does not do anything until it has to.
In this case, you register all operations that you want done, but until you actually use any of the results, the event loop is not actually run. When you say you want to draw df_h_sig_4mu, at that point RDataFrame triggers the event loop that fills all your histograms.

The mechanism is explained in the crash course here.

@eguiraud Ok, thank you. That helped/worked. One last thing (hopefully) do you know where I can download all the header files.In this tutorial that I am working on I noticed the .hxx and .h header files contain header files of their own and those header files have their own header files etc. So, how can I just download all the header files so that I can use the tutorial most efficiently.

All the header files used by ROOT tutorials should come with your ROOT installation.

@eguiraud Using my debugger though I am getting errors that it cannot find these source files and that is why I am getting these segmentation violations. I checked my configurationPath and it is set correctly. I am not sure what is going on

That probably means that the crash happens in code that is just-in-time compiled by ROOT.

I can take a look if you can produce a minimal reproducer (the smallest amount of code that reproduces the crash) and provide some example data I can run the reproducer on.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.