RDataFrame Cut left out some events

jbrom · March 3, 2025, 8:27pm


ROOT Version: 6.32.10
Platform: win64
Compiler: MSVC 19.39.33521.0

A couple of weeks ago I used RDataFrame (RDF) to apply cuts to a ROOT file and write out a new, smaller data file. The cuts were on chi^2/NDF, momentum transfer (t), proton momentum, and phi mass.

I recently made histograms of these four quantities and found that the cuts worked for all but the phi mass. Some events remain in the range 0.72-0.88 GeV, which is outside my cut window (0.9-1.14 GeV). This is around the mass of the omega meson, which would normally dominate the decay channel I’m looking at (3pi). Even so, these values are outside my cut, so I don’t understand why they are still in the data after the cut.

Cutting file:
cutter_Macro.C (4.7 KB)

Cut Check:
rdf_cut_check.C (6.2 KB)

Histograms:
c1.root (11.7 KB)

Danilo · March 3, 2025, 9:29pm

Hi,

Thanks for the post!
If I understand correctly, you are using RDF to analyse a certain columnar dataset and you believe that, in some cases, events are not discarded because a cut you apply on some quantity has no effect. Please do not hesitate to correct me if I am wrong here.

If I understood correctly, could you please isolate the case where you believe the cut applied by RDF has no effect and share with us a minimal reproducer?

Best,
D

jbrom · March 3, 2025, 10:52pm

Yes, that’s correct.

Here is the mass distribution before the mass cut:

Here is the code I used to make the cuts:

using PxPyPzE = ROOT::Math::LorentzVector<ROOT::Math::PxPyPzE4D<double>>;
void cutter_Macro() {
    ROOT::EnableImplicitMT();

    auto inFileName = "flat_pi0pippim__B4.root";
    auto outFileName = "flat_pi0pippim__B4_cut.root";
    auto treeName = "pi0pippim__B4;1";
    TFile *filein = new TFile(inFileName);
    TTree *tree = (TTree *)filein->Get(treeName);

    PxPyPzE Gamma1P4, Gamma2P4, Pi0P4, ProtonP4, PiMinusP4, PiPlusP4, PhiP4;

    auto proton_4Vec = [&ProtonP4](TLorentzVector p_p4_kin) { return ProtonP4.SetPxPyPzE(p_p4_kin.Px(), p_p4_kin.Py(), p_p4_kin.Pz(), p_p4_kin.E()); };
    auto pim_4Vec = [&PiMinusP4](TLorentzVector pim_p4_kin) { return PiMinusP4.SetPxPyPzE(pim_p4_kin.Px(), pim_p4_kin.Py(), pim_p4_kin.Pz(), pim_p4_kin.E()); };
    auto pip_4Vec = [&PiPlusP4](TLorentzVector pip_p4_kin) { return PiPlusP4.SetPxPyPzE(pip_p4_kin.Px(), pip_p4_kin.Py(), pip_p4_kin.Pz(), pip_p4_kin.E()); };
    auto g1_4Vec = [&Gamma1P4](TLorentzVector g1_p4_kin) { return Gamma1P4.SetPxPyPzE(g1_p4_kin.Px(), g1_p4_kin.Py(), g1_p4_kin.Pz(), g1_p4_kin.E()); };
    auto g2_4Vec = [&Gamma2P4](TLorentzVector g2_p4_kin) { return Gamma2P4.SetPxPyPzE(g2_p4_kin.Px(), g2_p4_kin.Py(), g2_p4_kin.Pz(), g2_p4_kin.E()); };

    ROOT::RDataFrame d(treeName, inFileName);

    auto d_4Vec = d.Define("Gamma1_4Vec", g1_4Vec, {"g1_p4_kin"})
        .Define("Gamma2_4Vec", g2_4Vec, {"g2_p4_kin"})
        .Define("PiPlus_4Vec", pip_4Vec, {"pip_p4_kin"})
        .Define("PiMinus_4Vec", pim_4Vec, {"pim_p4_kin"})
        .Define("Pi0_4Vec", "Gamma1_4Vec + Gamma2_4Vec")
        .Define("Proton_4Vec", proton_4Vec, {"p_p4_kin"})
        .Define("Phi_4Vec", "PiPlus_4Vec + PiMinus_4Vec + Pi0_4Vec");

    auto t_cut = [](float Mandlestam_t) { return -Mandlestam_t < 1.; };
    auto chi2NDF_cut = [](UInt_t kin_ndf, float kin_chisq) { return (kin_chisq / kin_ndf) < 6.; };
    auto protonMom_cut = [](PxPyPzE Proton_4Vec) { return Proton_4Vec.P() > 0.3; };
    auto phiMass_cut = [](PxPyPzE Phi_4Vec) { return Phi_4Vec.M() > 0.9 && Phi_4Vec.M() < 1.14; };

    auto d_4Vec_cut = d_4Vec.Filter(chi2NDF_cut, {"kin_ndf", "kin_chisq"})
        .Filter(t_cut, {"Mandlestam_t"})
        .Filter(protonMom_cut, {"ProtonMom"})
        .Filter(phiMass_cut, {"PhiMass"});

    d_4Vec_cut.Snapshot("pi0pippim__B4_cut", "flat_pi0pippim__B4_cut.root");
}

Here is the mass distribution after the cut:

As you can see, the mass cut almost worked. There are still some events that lie outside of the cut range.

How would something like this happen?

dastudillo · March 4, 2025, 7:42am

There are also events in the P histogram with P < 0.3, so both vector-related cuts have the issue. Assuming the problem is not in the P and M methods of TLorentzVector (by the way, you use both LorentzVector and TLorentzVector; I don’t use them, but maybe you should not mix them?), try simplifying your code to read and cut on only one of them, and check that you are reading the variables correctly in the first place. Also print out the details of the events that are not being cut and see if there’s anything strange.
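
As a rough illustration of that check, here is a minimal sketch in plain C++ with no ROOT dependency. It recomputes the invariant mass from the four-vector components and lists the entries that fall outside the cut window. The names `FourVec`, `add`, `mass`, `passesMassCut` and `outsideWindow` are hypothetical and not part of the macro above or of ROOT; only the window limits (0.9 and 1.14) are taken from `phiMass_cut`. Every function is stateless and returns a new value, so nothing is shared between calls.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical plain four-vector; a stand-in for the ROOT vector classes in the thread.
struct FourVec {
    double px, py, pz, e;
};

// Stateless sum: builds and returns a new value, writes to no shared variable.
FourVec add(const FourVec &a, const FourVec &b) {
    return {a.px + b.px, a.py + b.py, a.pz + b.pz, a.e + b.e};
}

// Invariant mass M = sqrt(E^2 - |p|^2). For negative M^2 this sketch returns
// -sqrt(-M^2), so unphysical inputs stay visible instead of turning into NaN.
double mass(const FourVec &v) {
    // Assumption of this sketch: inputs are finite; NaN would silently fail the cut.
    assert(std::isfinite(v.px) && std::isfinite(v.py) &&
           std::isfinite(v.pz) && std::isfinite(v.e));
    const double m2 = v.e * v.e - (v.px * v.px + v.py * v.py + v.pz * v.pz);
    return m2 >= 0.0 ? std::sqrt(m2) : -std::sqrt(-m2);
}

// Same window as phiMass_cut in the macro above.
bool passesMassCut(double m) { return m > 0.9 && m < 1.14; }

// Indices of the entries whose mass lies outside the window.
// These are the events worth printing and inspecting one by one.
std::vector<std::size_t> outsideWindow(const std::vector<double> &masses) {
    std::vector<std::size_t> bad;
    for (std::size_t i = 0; i < masses.size(); ++i) {
        if (!passesMassCut(masses[i])) bad.push_back(i);
    }
    return bad;
}
```

Running `outsideWindow` over masses recomputed from the four-vectors in the output file gives the entries to look at. If the recomputed masses all pass while the histogram still shows events outside the window, the histogram may be filled from a different column than the one that was actually cut on.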