Issues in generating dictionaries using ROOT

Hello ROOT forum,

I am having issues with generating dictionaries. Here’s a standalone, simplified reproducer, MG_reproducer.C. It looks for all u, d, c and s quarks and antiquarks in an event, creates a vector of their 4-vectors, and saves that information to another .root file.

It has three helper functions: findPIDIndex, which looks up particle indices from Particle.PID; get4Vec, which creates a 4-vector; and get4VecIndex, which creates 4-vectors from the indices of particles in the Particle class.

#include <ROOT/RDataFrame.hxx>
#include <ROOT/RVec.hxx>
#include <TLorentzVector.h>
#include <iostream>
#include <vector>
#include <string>
#include <stdexcept>

std::vector<int> findPIDIndex(const ROOT::VecOps::RVec<int>& pid, const std::vector<int>& myIDs) {

    std::vector<int> ret;
    ret.reserve(pid.size()); // Reserve memory upfront for ret

    for (int i = 0; i < pid.size(); ++i) {

        for (int j = 0; j < myIDs.size(); ++j) {

            if (pid[i] == myIDs[j]) {
                ret.emplace_back(i);
            }

        }

    }

    return ret;

}

/*
    Given a particle's parameters, output its 4-vector.
    mode = 'xyze': register inputs as x, y, z and E
    mode = 'pepm': register inputs as PT, eta, phi and mass
    Default mode is 'xyze'.
*/
TLorentzVector get4Vec(
    const double input1,
    const double input2,
    const double input3,
    const double input4,
    const std::string& mode = "xyze")
{
    TLorentzVector ret;
    if (mode == "xyze") {
        ret.SetPxPyPzE(input1, input2, input3, input4);
    } else if (mode == "pepm") {
        ret.SetPtEtaPhiM(input1, input2, input3, input4);
    } else {
        throw std::runtime_error("get4Vec: cannot recognise input string");
    }
    return ret;
}

/*
    Given four input vectors, build the 4-vector for each of the given indices and return them as a vector of 4-vectors. The order of the indices is preserved.
*/
ROOT::VecOps::RVec<TLorentzVector> get4VecIndex(
    const std::vector<int>& indices,
    const ROOT::VecOps::RVec<Float_t>& inputVec1,
    const ROOT::VecOps::RVec<Float_t>& inputVec2,
    const ROOT::VecOps::RVec<Float_t>& inputVec3,
    const ROOT::VecOps::RVec<Float_t>& inputVec4,
    const std::string& mode)
{
    ROOT::VecOps::RVec<TLorentzVector> retVector;

    TLorentzVector ret_i;
    for (int i = 0; i < indices.size(); ++i) {
        ret_i = get4Vec(
            inputVec1[ indices[i] ],
            inputVec2[ indices[i] ],
            inputVec3[ indices[i] ],
            inputVec4[ indices[i] ],
            mode);
        retVector.emplace_back(ret_i);
    }

    return retVector;
}


int MG_reproducer() {

    ROOT::EnableImplicitMT(); // Tell ROOT you want to go parallel

    /* my file set-up */
    std::string name_of_folder = "MG_0704";
    std::string path = "/isilon/data/users/jhuan166/MG5_aMC_v3_5_0/" + name_of_folder + "/Events/run_01";
    std::string f_in = "tag_1_delphes_events";
    std::string full_path = path + "/" + f_in + ".root";

    ROOT::RDataFrame df("Delphes", full_path); // Interface to TTree and TChain

    /* obtain the tag udcs and their anti */
    auto d1 = df.Define("Gen_qqx_indices", "return findPIDIndex(Particle.PID, {1, 2, 3, 4, -1, -2, -3, -4});"); // udcs and their anti

    /* obtain a vector of 4-vecs. There would be two of them. */
    d1 = d1.Define("Gen_qqx_4vecs", "return get4VecIndex(Gen_qqx_indices, Particle.Px, Particle.Py, Particle.Pz, Particle.E, \"xyze\");");

    std::string f_out = "tag_1_delphes_events_tagged";
    std::string full_path_out = path + "/" + f_out + ".root";
    d1.Snapshot("Delphes", full_path_out, {"Gen_qqx_indices", "Gen_qqx_4vecs"});


    return 0;
}

The error I get is

Error in <TTree::Branch>: The class requested (vector<TLorentzVector,ROOT::Detail::VecOps::RAdoptAllocator<TLorentzVector> >) for the branch "Gen_qqx_4vecs" is an instance of an stl collection and does not have a compiled CollectionProxy. Please generate the dictionary for this collection (vector<TLorentzVector,ROOT::Detail::VecOps::RAdoptAllocator<TLorentzVector> >) to avoid to write corrupted data.
RDataFrame::Run: event loop was interrupted

This error is repeated many times until it says

Error in <TRint::HandleTermInput()>: std::logic_error caught: Trying to insert a null branch address.

I get this error even after compiling the macro in ROOT using .L MG_reproducer.C+, which compiles successfully. I am using this technique from this tutorial by Enrico Guiraud. I also get similar errors if I use a std::vector<TLorentzVector> instead of a ROOT::VecOps::RVec<TLorentzVector>.

Many thanks!

Hi,

thanks for the post and welcome to the ROOT Community!

The command you are looking for is

g++ -o myscript myscript.C ./mydir/helper1.C ./mydir/helper2.C `root-config --cflags --libs --glibs`

However, that will not take care of generating the dictionaries for your classes.
It would be best if you could provide a standalone, simplified reproducer for the setup you are dealing with so that we can get you started.

I hope this helps!

Cheers,
D


Thank you very much, Danilo. I agree that generating dictionaries is my issue. I am going to modify the question title and body for this purpose.

Hello Danilo,

I have updated the question itself to focus on generating dictionaries. Many thanks.

Hi,

Thanks.
I understand you want to:

  • Get 2 indices per event based on the PIDs
  • For these 2 indices, generate two 4-vectors starting from the px, py, pz, e values of all particles in the event
  • Write out the collections of the 2 indices and the 2 particles

Please find below a minimal standalone example that works and hopefully reduces the code to maintain. Note that I had to do some guesswork inventing data because I have no access to your input.

// Execute with root MG_reproducer.C+

#include <ROOT/RDataFrame.hxx>
#include <ROOT/RVec.hxx>
#include <vector>

// For the mockups
#include "TRandom.h"
#include "Math/Vector4D.h"

// For the dictionaries
#ifdef __ROOTCLING__
#pragma link C++ class ROOT::VecOps::RVec<ROOT::Math::LorentzVector<ROOT::Math::PxPyPzE4D<double>>>;
#endif

void MG_reproducer()
{
   using namespace ROOT; // for RDataFrame, RVecF/UL and Math::PxPyPzEVector
   RDataFrame df(4);

   // Mock data
   const std::vector<RVecUL> Gen_qqx_indices_v{{0, 4}, {1, 2}, {2, 3}, {3, 4}};
   auto genRVecF = []() {
      auto unif = []() { return (float)gRandom->Uniform(0, 16.); };
      return RVecF({unif(), unif(), unif(), unif()});
   };
   auto d1 = df.Define("pxs", genRVecF).Define("pys", genRVecF).Define("pzs", genRVecF).Define("es", genRVecF);

   /* Mockup: obtain the tag udcs and their anti */
   d1 = d1.Define("Gen_qqx_indices", [&Gen_qqx_indices_v](ULong64_t ievt) { return Gen_qqx_indices_v[ievt]; },
                  {"rdfentry_"}); // udcs and their anti

   // Construct the 4 vectors
   auto Construct4vecs = [](const RVecF &pxs, const RVecF &pys, const RVecF &pzs, const RVecF &es) {
      return Construct<Math::PxPyPzEVector>(pxs, pys, pzs, es);
   };

   // NOTE: here we create the vecs and filter them out based on the indices built above
   d1 = d1.Define("Gen_qqx_4vecs_dirty", Construct4vecs, {"pxs", "pys", "pzs", "es"})
          .Define("Gen_qqx_4vecs", "Take(Gen_qqx_4vecs_dirty, Gen_qqx_indices)");

   d1.Snapshot("Delphes", "myOutput.root", {"Gen_qqx_indices", "Gen_qqx_4vecs"});
}
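
In case the Take step is unfamiliar: it selects the elements at the given indices, in the order the indices are listed. Below is a rough sketch of that selection in plain C++, without ROOT, just to illustrate the idea. The names FourVec and takeByIndex are made up for this illustration and are not part of ROOT.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 4-vector; this is NOT a ROOT type.
struct FourVec {
    double px, py, pz, e;
};

// Return the elements of `values` found at the positions listed in `indices`,
// in the same order as the indices are given (illustrative helper only).
template <typename T>
std::vector<T> takeByIndex(const std::vector<T>& values,
                           const std::vector<std::size_t>& indices)
{
    std::vector<T> out;
    out.reserve(indices.size());
    for (const std::size_t idx : indices) {
        // Guard against indices that point past the end of the input.
        assert(idx < values.size() && "index out of range");
        out.push_back(values[idx]);
    }
    return out;
}
```

With five 4-vectors per event and indices {0, 4}, this returns the first and the last 4-vector, which mirrors what the "Gen_qqx_4vecs" column holds for the first mock event.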

I hope this helps!

Cheers,
D

PS
Next time, if you can, please open a new post instead of substantially editing the existing one in place (but of course, thanks again for doing that; it’s no problem at all).

Hello Danilo,

Thanks for the simplified example. I made some tweaks for it to run on my end, namely

  • RVecUL → RVec<unsigned long>
  • RVecF → RVec<float>
  • Construct<Math::PxPyPzEVector> → ROOT::VecOps::Construct<Math::PxPyPzEVector>

Otherwise ROOT tells me that these names don’t exist under the namespaces provided.

I received the following error:

Error in <TTree::Branch>: The class requested (vector<ROOT::Math::LorentzVector<ROOT::Math::PxPyPzE4D<double> >,ROOT::Detail::VecOps::RAdoptAllocator<ROOT::Math::LorentzVector<ROOT::Math::PxPyPzE4D<double> > > >) for the branch "Gen_qqx_4vecs" is an instance of an stl collection and does not have a compiled CollectionProxy. Please generate the dictionary for this collection (vector<ROOT::Math::LorentzVector<ROOT::Math::PxPyPzE4D<double> >,ROOT::Detail::VecOps::RAdoptAllocator<ROOT::Math::LorentzVector<ROOT::Math::PxPyPzE4D<double> > > >) to avoid to write corrupted data.

For your information, I think this part got removed when I edited the question, but my setup is:

ROOT Version: 6.24/07
Platform: linuxx8664gcc
Compiler: g++ (GCC) 10.3.0

Meanwhile, for my original code, I added the dictionary generating bit

#ifdef __ROOTCLING__
#pragma link C++ class vector<TLorentzVector,ROOT::Detail::VecOps::RAdoptAllocator<TLorentzVector>>;
#endif

and it is running okay.

Hi,

Yes, as shown in the example, the pragma line needs to be there.
And for the typedefs, apologies, I did not notice you were using 6.24 (may I suggest moving to a more recent version to take advantage of the latest features?)

Cheers,
D


Thanks, Danilo. I will try to switch to a more recent version.

My original code runs after the #pragma was added, but it seems that the “vector of TLorentzVectors” gets corrupted after it is saved. While running, it prints the warning

Warning in <TTree::Bronch>: Using split mode on a class: TLorentzVector with a custom Streamer

Of course, I know that TLorentzVector is an outdated class, and I will move to a newer class. Overall, thank you very much for all your help!

My pleasure. Do not hesitate to come back if you have any additional questions.


I want to post here a follow-up. I did the following three things:

  • Update to ROOT 6.30/03
  • Use ROOT::Math::LorentzVector
  • Generate a dictionary manually using #pragma link C++ class ...

Doing these solved my problem. Now I am able to save a std::vector of LorentzVector as a branch into my new .root file and access it.
