Dealing with columns of different lengths in RDF

Hi,

I have an RDF with columns defined for both track and truth (e.g. df.Define("track", "") and df.Define("truth", "")). As track comes from reco, it has a different length to truth. I want to subtract the two columns (e.g. df.Define("Delta_TrackTruth", "abs(track-truth)")), but this returns an error saying the columns are not of equal size, so the operator "-" can't be used. Is there any way to get around this so that I can perform this operation?

Many Thanks,
Sinead

ROOT Version: 6.28
Platform: CentOS7
Compiler: gcc

Dear Sinead,

thanks for the interesting question and welcome to the ROOT community!

I think this kind of subtraction makes sense only if you match the gen information to the reconstructed one. This can be done in many ways and has to be coded by you in a matcher; it is not generic enough for RDF to do it for all use cases. Nevertheless, you have all the pieces at your disposal, for example in the VecOps namespace.
The end result should be a list of reconstructed tracks (which you basically have already) and a list of matched gen truth particles, arranged so that it lines up with the reconstructed info.
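
To make the idea of a "matched truth list" a bit more concrete, here is a minimal sketch in plain C++ (no ROOT dependency). It assumes a matcher has already produced, for each reco track, the index of its truth partner, or -1 if there is none. The names AlignTruthToReco, AbsDifference and matchIdx are hypothetical and only for illustration; they are not part of ROOT.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <stdexcept>
#include <vector>

// Build a truth list with one entry per reco track.
// matchIdx[i] is the index of the truth particle matched to reco track i,
// or a negative / out-of-range value if no match was found (assumption).
std::vector<double> AlignTruthToReco(const std::vector<double> &truth,
                                     const std::vector<int> &matchIdx,
                                     double fill)
{
    std::vector<double> aligned(matchIdx.size(), fill);
    for (std::size_t i = 0; i < matchIdx.size(); ++i) {
        const int j = matchIdx[i];
        if (j >= 0 && static_cast<std::size_t>(j) < truth.size())
            aligned[i] = truth[j];
    }
    return aligned;
}

// Element-wise |reco - truth| for two lists that now have the same length.
std::vector<double> AbsDifference(const std::vector<double> &reco,
                                  const std::vector<double> &alignedTruth)
{
    if (reco.size() != alignedTruth.size())
        throw std::invalid_argument("inputs must have the same size");
    std::vector<double> diff(reco.size());
    for (std::size_t i = 0; i < reco.size(); ++i)
        diff[i] = std::abs(reco[i] - alignedTruth[i]);
    return diff;
}
```

Using NaN as the fill value keeps unmatched tracks visible in the result, so they can be filtered out later instead of silently producing a number. In RDF the same logic could go into a lambda passed to Define, with the std::vector arguments replaced by ROOT::RVec columns.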

I hope this helps!

Cheers,
D

Thanks for this information!
I realised I have probably over-generalised my question.
More specifically, if one had the phi of the track at the start of the tracking spectrometer, and the phi of the truth extrapolated to the same position, and wanted to subtract them, how would one go about this?

Hi Sinead,

What you would need is an MC truth matcher; it's a very common practice in HEP. ROOT can provide the pieces to construct one, but this is really analysis/study specific.

This is really a bit beyond the scope of this forum, but one usually matches tracks to charged particles using, at a minimum, a deltaR criterion. From this matcher you should be able to get a set of matching gen particles, and their kinematic variables, as columns in your RDF, and continue your analysis as you need to.
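
As a rough illustration of what such a matcher could look like, here is a minimal sketch in plain C++ (no ROOT dependency). For each reco track it picks the truth particle with the smallest deltaR, provided it is below a cut. The function name MatchByDeltaR and the maxDR parameter are hypothetical, and the sketch does not enforce one-to-one matching, which a real analysis may need.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// For each reco track, return the index of the closest truth particle in
// deltaR = sqrt(deta^2 + dphi^2), or -1 if none is closer than maxDR.
// Assumes eta and phi vectors of the same object type have equal length.
std::vector<int> MatchByDeltaR(const std::vector<double> &recoEta,
                               const std::vector<double> &recoPhi,
                               const std::vector<double> &truthEta,
                               const std::vector<double> &truthPhi,
                               double maxDR)
{
    const double twoPi = 2.0 * std::acos(-1.0);
    std::vector<int> matchIdx(recoEta.size(), -1);
    for (std::size_t i = 0; i < recoEta.size(); ++i) {
        double best = maxDR;
        for (std::size_t j = 0; j < truthEta.size(); ++j) {
            const double deta = recoEta[i] - truthEta[j];
            // Wrap the phi difference into [-pi, pi] so that angles on
            // either side of the +-pi boundary are treated as close.
            const double dphi = std::remainder(recoPhi[i] - truthPhi[j], twoPi);
            const double dr = std::hypot(deta, dphi);
            if (dr < best) {
                best = dr;
                matchIdx[i] = static_cast<int>(j);
            }
        }
    }
    return matchIdx;
}
```

The returned indices can then be used to pick, for each track, the phi of its truth partner and form the difference, again wrapping it into [-pi, pi]. If I remember correctly, ROOT::VecOps also offers DeltaR and DeltaPhi helpers that could replace the hand-written arithmetic when working with RVec columns.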

Cheers,
D

Dear @seley ,

Thanks for reaching out to the forum! Although I agree with Danilo in that the specifics of your analysis will need to be treated with the proper physics, I wanted to show a very abstract and simple snippet of code to deal with columns that contain collections of different sizes in the same event in RDF.

The following example (which is deliberately pedantic and longer than necessary, for demonstration purposes) creates a trivial TTree dataset with two branches and a single entry, each branch containing a collection. The collection in the first branch, vec1, is longer than the one in the second branch, vec2. The most important part for your use case is the ROOT::VecOps::Take overload that takes the number of elements you want the output collection to have and a default value used to fill the output when the input collection is smaller than that. With this function, you can ensure that the collections from branch vec2 always have the same size as the collections from branch vec1, by passing the size of the vec1 collection at that entry.

#include <ROOT/RDataFrame.hxx>
#include <ROOT/RVec.hxx>
#include <TFile.h>
#include <TTree.h>

#include <cstdio> // std::remove
#include <vector>

struct Dataset
{
    constexpr static auto filename{"myfile.root"};
    constexpr static auto treename{"mytree"};
    Dataset()
    {
        TFile f{filename, "recreate"};
        TTree t{treename, treename};

        std::vector<float> vec1{1.1f, 2.2f, 3.3f, 4.4f, 5.5f};
        std::vector<float> vec2{6.6f, 7.7f};

        t.Branch("vec1", &vec1);
        t.Branch("vec2", &vec2);
        t.Fill();
        t.Write();
    }

    ~Dataset()
    {
        std::remove(filename);
    }
};

int main()
{
    Dataset dataset;
    ROOT::RDataFrame df{dataset.treename, dataset.filename};
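    // Bring vec2 to the size of vec1, filling missing elements with 10.f,
    // so that the element-wise sum of the two collections is well defined.
    auto display = df.Define("vec3", [](const ROOT::RVecF &vec1, const ROOT::RVecF &vec2)
                             { return vec1 + ROOT::VecOps::Take(vec2, vec1.size(), 10.f); }, {"vec1", "vec2"})
                       .Display<ROOT::RVecF, ROOT::RVecF, ROOT::RVecF>({"vec1", "vec2", "vec3"});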
    display->Print();
}

I hope this is of some help to you.
Cheers,
Vincenzo