TDataFrame: Calculate variables from a branch of the TTree

Hi,

I’m starting to use TDataFrame and I want to calculate variables
from a TBranch of a TTree using: d.Define("pt", "sqrt(px*px + py*py)");
What is the correct format of the command to do this?

My program

void analys(){
ROOT::Experimental::TDataFrame d("Particle","crmc_eposlhc_416075184_p_C_800.root");
        
auto hpx = d.Histo1D({"hpx", "px distribution", 100, -10, 10}, "px");
        
        auto c1 = new TCanvas("c1", "c1", 10, 10, 700, 500);
        hpx->GetYaxis()->SetTitle("dN/dpx");
        hpx->GetXaxis()->SetTitle("px");
        hpx->DrawClone();
}

Cheers,
Andre

Hi Andre,

there are two ways of achieving this: with a jitted string or with a regular C++ callable such as a lambda.

Case 1:

void analys(){
ROOT::Experimental::TDataFrame d("Particle","crmc_eposlhc_416075184_p_C_800.root");
        
auto dd = d.Define("pt", "sqrt(px*px + py*py)");
auto hpx = dd.Histo1D({"hpt", "pt distribution", 100, -10, 10}, "pt");

[...]

Case 2 (supposing px and py are floats!):

void analys(){
ROOT::Experimental::TDataFrame d("Particle","crmc_eposlhc_416075184_p_C_800.root");
        
auto dd = d.Define("pt", [](float px, float py) {return sqrt(px*px + py*py);}, {"px", "py"});
auto hpx = dd.Histo1D({"hpt", "pt distribution", 100, -10, 10}, "pt");

[...]

For more details, you can have a look at this tutorial: https://root.cern/doc/master/tdf001__introduction_8C.html

Cheers,
D

Thank you for the reply!

I tried both of your approaches, but something is going wrong.

px and py are doubles.
The first way:

using namespace std;

#include <iostream>
#include <string>

 void Analys(){
        ROOT::Experimental::TDataFrame d("Particle","crmc_eposlhc_416075184_p_C_800.root");
        auto dd = d.Define("pt", "sqrt(px*px + py*py)");
        auto hpt = dd.Histo1D({"hpt", "pt distribution", 100, 0, 4}, "pt");
        
        [...]
 }

Error:

input_line_51:3:15: error: invalid operands to binary expression ('std::array_view<double>' and 'std::array_view<double>')
return sqrt(px*px + py*py)

/opt/root6/include/TTime.h:85:14: note: candidate function not viable: no known conversion from 'std::array_view<double>' to 'const TTime' for 1st argument
inline TTime operator*(const TTime &t1, const TTime &t2)

/opt/root6/include/TVectorT.h:250:19: note: candidate template ignored: could not match 'TVectorT' against 'array_view'
TVectorT<Element> operator*   (const TVectorT <Element>  &source, Element val) { return val * source; }
                  ^
terminate called after throwing an instance of 'std::runtime_error'
  what():  Cannot interpret the following expression:
sqrt(px*px + py*py)

Make sure it is valid C++.

The Second way:
float -> double

using namespace std;

#include <iostream>
#include <string>

 void Analys(){
        ROOT::Experimental::TDataFrame d("Particle","crmc_eposlhc_416075184_p_C_800.root");
        auto dd = d.Define("pt", [](double px, double py) {return sqrt(px*px + py*py);}, {"px", "py"});
        auto hpt = dd.Histo1D({"hpt", "pt distribution", 100, 0, 4}, "pt");    
        
        [...]
 }

With the second way I get a plot, but I think pt is not calculated:

file: crmc_eposlhc_416075184_p_C_800.root


TTree    Particle          particles produced (entries=1000)
  nPart  "nPart/I"         TBranch
  pdgid  "pdgid[nPart]/I"  TBranch
  status "status[nPart]/I" TBranch
  px     "px[nPart]/D"     TBranch
  py     "py[nPart]/D"     TBranch
  pz     "pz[nPart]/D"     TBranch
  E      "E[nPart]/D"      TBranch
  m      "m[nPart]/D"      TBranch

Cheers, Andre

Hi Andre,

what you are dealing with are collections (C arrays), not individual values. You’ll need to treat them as such.
For example:

std::vector<float> v;
using floats = std::array_view<float>;
auto ptCalc = [&v](floats pxs, floats pys) {
   v. clear(); 
   for (unsigned int i=0;i < pxs.size(); ++i) v.emplace_back(sqrt(px*px + py*py));
   return v;};

auto dd = d.Define("pt", ptCalc, {"px", "py"});

This is admittedly less than optimal. We are actively seeking the best way for TDF to handle collections like in your case.

Cheers,
D

Hi, Danilo,

Thanks for the help!

Now I understand the problem. I followed your suggestion:

    void Analys(){
        
  
        ROOT::Experimental::TDataFrame d("Particle","crmc_eposlhc_416075184_p_C_800.root");
        std::vector<float> v;
        using floats = std::array_view<float>;

        auto ptCalc = [&v](floats pxs, floats pys) {
        v. clear(); 
        for (unsigned int i=0;i < pxs.size(); ++i) v.emplace_back(sqrt(px*px + py*py));
        return v;};

        auto dd = d.Define("pt", ptCalc, {"px", "py"});
        auto hpt = dd.Histo1D({"hpt", "pt distribution", 100, 0, 4}, "pt");
        
        [...]    
  
    }

Error:

error: use of undeclared identifier 'px'
        for (unsigned int i=0;i < pxs.size(); ++i) v.emplace_back(sqrt(px*px + py*py));
error: use of undeclared identifier 'py'
        for (unsigned int i=0;i < pxs.size(); ++i) v.emplace_back(sqrt(px*px + py*py));

 *** Break *** segmentation violation



===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
#0  0x00007f94a35f507a in __GI___waitpid (pid=25205, stat_loc=stat_loc@entry=0x7ffc9f9ed4c0, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:29
#1  0x00007f94a356dfbb in do_system (line=<optimized out>) at ../sysdeps/posix/system.c:148
#2  0x00007f94a419268d in TUnixSystem::Exec (shellcmd=<optimized out>, this=0x1369570) at /opt/root6/root6_src/core/unix/src/TUnixSystem.cxx:2118
#3  TUnixSystem::StackTrace (this=0x1369570) at /opt/root6/root6_src/core/unix/src/TUnixSystem.cxx:2412
#4  0x00007f94a4194c5c in TUnixSystem::DispatchSignals (this=0x1369570, sig=kSigSegmentationViolation) at /opt/root6/root6_src/core/unix/src/TUnixSystem.cxx:3643
#5  <signal handler called>
#6  0x00007f94a07b8254 in clang::CXXRecordDecl::getLambdaCallOperator() const () from /opt/root6/lib/libCling.so
#7  0x00007f949ec26ccc in clang::RecursiveASTVisitor<cling::(anonymous namespace)::StaticVarCollector>::TraverseLambdaExpr(clang::LambdaExpr*, llvm::SmallVectorImpl<llvm::PointerIntPair<clang::Stmt*, 1u, bool, llvm::PointerLikeTypeTraits<clang::Stmt*>, llvm::PointerIntPairInfo<clang::Stmt*, 1u, llvm::PointerLikeTypeTraits<clang::Stmt*> > > >*) () from /opt/root6/lib/libCling.so
#8  0x00007f949ee6b56e in clang::RecursiveASTVisitor<cling::(anonymous namespace)::StaticVarCollector>::TraverseStmt(clang::Stmt*, llvm::SmallVectorImpl<llvm::PointerIntPair<clang::Stmt*, 1u, bool, llvm::PointerLikeTypeTraits<clang::Stmt*>, llvm::PointerIntPairInfo<clang::Stmt*, 1u, llvm::PointerLikeTypeTraits<clang::Stmt*> > > >*) [clone .part.2670] () from /opt/root6/lib/libCling.so

I tried to define the variables px and py, but it did not work; maybe the way
I defined them is wrong.

How should I define the variables that are used in v.emplace_back?

Cheers, Andre

Hi Andre,

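there was a small mistake in my code: inside the loop the arrays must be indexed element by element (pxs[i], pys[i]), since px and py are not declared there. This should work: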

    void Analys(){
        
  
        ROOT::Experimental::TDataFrame d("Particle","crmc_eposlhc_416075184_p_C_800.root");
        std::vector<float> v;
        using floats = std::array_view<float>;

        auto ptCalc = [&v](floats pxs, floats pys) {
        v. clear(); 
        for (unsigned int i=0;i < pxs.size(); ++i) v.emplace_back(sqrt(pxs[i]*pxs[i] + pys[i]*pys[i]));
        return v;};

        auto dd = d.Define("pt", ptCalc, {"px", "py"});
        auto hpt = dd.Histo1D({"hpt", "pt distribution", 100, 0, 4}, "pt");
        
        [...]    
  
    }

My issue is that I cannot try out my snippets :) In case we continue this thread, could you share in the post a few events of your dataset? A tree with 3 entries would be enough if you are not comfortable sharing the entire ROOT file.
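
Until a sample file is available, the loop logic can be checked on its own, without ROOT or the data. The following is a minimal sketch under stated assumptions: computePt is a hypothetical helper name (not part of ROOT), and std::vector<double> stands in for the per-entry px and py arrays that TDataFrame passes to the lambda.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not a ROOT function): returns one pt value per
// particle, given the px and py arrays of a single tree entry.
// Assumption: plain std::vector<double> replaces the array view type
// used in the TDataFrame lambda above.
std::vector<double> computePt(const std::vector<double>& pxs,
                              const std::vector<double>& pys) {
    // Both arrays have nPart elements in the tree, so the sizes must match.
    if (pxs.size() != pys.size()) {
        throw std::invalid_argument("px and py must have the same length");
    }
    std::vector<double> pts;
    pts.reserve(pxs.size());
    // Index the arrays element by element, as in the corrected lambda.
    for (std::size_t i = 0; i < pxs.size(); ++i) {
        pts.push_back(std::sqrt(pxs[i] * pxs[i] + pys[i] * pys[i]));
    }
    return pts;
}
```

For example, with the made-up inputs px = {3, 0, -6} and py = {4, 2, 8}, the function returns {5, 2, 10}. Unlike the lambda above, this sketch returns a fresh vector on every call instead of reusing a captured one; that is a design choice of the sketch, not a requirement stated in this thread.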

Cheers,
D

Hi, Danilo,

Thank you very much, it worked!

I changed float to double:

    void Analys(){
        
        ROOT::Experimental::TDataFrame d("Particle","crmc_eposlhc_416075184_p_C_800.root");
        std::vector<double> v;
        using doubles = std::array_view<double>;
        
        auto ptCalc = [&v](doubles pxs, doubles pys) {
            v. clear(); 
            for (unsigned int i=0;i < pxs.size(); ++i) v.emplace_back(sqrt(pxs[i]*pxs[i] + pys[i]*pys[i]));
            return v;};
            
            auto dd = d.Define("pt", ptCalc, {"px", "py"});
            auto hpt = dd.Histo1D({"hpt", "pt distribution", 100, 0, 5}, "pt");
        
        

        //drawing
        auto c1 = new TCanvas("c1", "c1", 10, 10, 700, 500);
        c1->SetGrid(1,1);
        c1->SetLogx(0); // 0 == scale without Log, 1 == scale with Log
        c1->SetLogy(0);
        hpt->GetYaxis()->SetTitle("dN/dpt");
        hpt->GetXaxis()->SetTitle("pt");
        hpt->DrawClone();
            
    }

The Graph:

The file crmc_eposlhc_416075184_p_C_800.root was generated by EPOS LHC.

Danilo, I am sharing the file with you by email through a folder in my Dropbox, because it is too big to upload here.

Thanks!
Great job!