Support for TDataFrame::Profile1D(2D) in pyROOT

Dear experts,

while Histo1D,2D,3D work just fine in python, calling something like:

TDataFrame::Profile1D(("pr","",100,0,1,0,1), "brA", "brB") 

ends in error:

TypeError: can not resolve method template call for 'Profile1D'

The same happens for Profile2D. Am I typing the command wrong? If so, what is the
correct way to call the method in Python? (The C++ version runs fine on the same tree.)

Thank you,

simone

Spec: ROOT master, gcc 7.1.1, python 3.6.2

Hi Simone,
your command is fine, or rather it will be: the feature is simply not there yet.
Support for Profile{1D,2D} is coming to master soon and will be part of the upcoming 6.12 ROOT release.
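
For readers unfamiliar with profiles, here is a minimal sketch in plain C++ of what a 1D profile accumulates: the mean of one column (here brB) in bins of another (brA). SimpleProfile is a hypothetical stand-in written for this post, not ROOT's TProfile and not what TDF does internally. It drops values of x outside the axis range and ignores the y range and bin errors, which is a simplification.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a profile histogram; NOT ROOT's TProfile.
struct SimpleProfile {
   std::size_t nbins;
   double xlow, xup;
   std::vector<double> sumY;       // sum of y per x bin
   std::vector<std::size_t> count; // number of entries per x bin

   SimpleProfile(std::size_t n, double lo, double up)
      : nbins(n), xlow(lo), xup(up), sumY(n, 0.0), count(n, 0) {}

   // Add one (x, y) pair. Pairs with x outside [xlow, xup) are dropped,
   // which is a simplification made by this sketch.
   void Fill(double x, double y)
   {
      if (nbins == 0 || x < xlow || x >= xup)
         return;
      auto bin = static_cast<std::size_t>((x - xlow) / (xup - xlow) * nbins);
      if (bin >= nbins)
         bin = nbins - 1; // guard against floating-point rounding
      sumY[bin] += y;
      ++count[bin];
   }

   // Mean of y in a 0-based bin; 0 for an empty bin.
   double Mean(std::size_t bin) const
   {
      assert(bin < nbins);
      return count[bin] ? sumY[bin] / count[bin] : 0.0;
   }
};
```

In these terms, the call in the question would fill such an object with one (brA, brB) pair per entry, using 100 bins between 0 and 1.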

Thank you for trying out TDF, do not hesitate to let us know what else is missing for PyROOT users! :smiley:

Cheers,
Enrico

Hi Simone,

first of all: thanks for having a look at the bleeding edge and giving feedback!

As Enrico says: it’s in the making. We implemented histograms in master and, given that no conceptual challenge was ahead of us for profiles, we decided to tackle more difficult aspects first, such as caching of datasets’ columns in memory for faster (re)processing (see https://root.cern/doc/master/classROOT_1_1Experimental_1_1TDF_1_1TInterface.html#a68fbd9949eb20ddc759e1e93a0663782). The November release, 6.12, is just around the corner and we need careful prioritisation.

We’ll do our best to squeeze into the release a PyROOT friendly way of dealing with profiles as well as a more powerful way to define columns and filters.

Cheers,
D

Hi Simone,

just a heads-up.
The code was added to make profiles work as the histograms do with TDataFrame in PyROOT.
The documentation should show the new signatures from early tomorrow morning (e.g. https://root.cern/doc/master/classROOT_1_1Experimental_1_1TDF_1_1TInterface.html#a4a5cd8f64aacf0894e1851580e3c16c1)

Cheers,
D

Hi Danilo,

thank you for the update. Is support for friend trees in TDF going to be part of 6.12 as well?

simone

Hi Simone,

in sequential mode, i.e. without implicit MT enabled, friend trees are already supported. For example:

using namespace ROOT::Experimental;
void a()
{
   TDataFrame tdf1(8);
   tdf1.Define("a", [](){return 1;}).Snapshot("t1","f1.root");

   TDataFrame tdf2(8);
   tdf2.Define("b",[](){return 2;}).Snapshot("t2","f2.root");

   TFile f1("f1.root");
   auto t1 = (TTree*) f1.Get("t1");

   t1->AddFriend("t2", "f2.root");

   TDataFrame tdf(*t1);
   tdf.Foreach([](int a, int b){cout << a << " " << b << endl;}, {"a", "t2.b"});
}

We’ll see what additional features related to friend trees can be squeezed into 6.12.

But just to understand better your use case, how are you using friend trees? What is the problem they solve for you in the work you are doing?

Cheers,
D

Hi Danilo,

my use case is the following: I have friend trees produced by analysis software (or rather, by software that handles the reconstruction of test beam data). On top of these trees I have a Python script that runs the analysis and makes plots. Everything works fine, but it’s quite slow and not suitable for a prompt analysis (i.e. DQM). I can speed things up using TTreeFormula, but TDataFrame could be a great way to achieve speed while keeping the code simple, clean and generic.

simone

Hi Simone,

thanks for the explanation.
One thing that can be done now, if MT is not crucial, is to deal with friend trees as in the example. We’ll see what to do next.

Cheers,
D

Hi Danilo,

indeed, if I switch off implicit MT it works perfectly. Thanks a lot!

simone

Hi Simone,

great.
This breaks, even if only slightly, the “universality” of TDF across the sequential and parallel cases. For us, up to now, this has been a must and I think it should continue to be. In other words, we are not giving up here, just prioritising and letting you make progress with your DQM code :slight_smile:

Cheers,
D

Hi Danilo,

are there known problems in mixing friends and the issue we discussed in this other topic:

When I try your recipe it works for arrays in the main tree but not for those in the friends. The Define call also doesn’t work for non-array variables stored in the friends.

Thank you again for the support,

simone

Hi,
yes, support for friend trees is not complete yet. I just opened PR 1135 to resolve the most obvious issue, which is probably what you are seeing now.

I will wait for the PR to be merged. Thank you!

Let’s say we’ll ping this thread when we think support for friend trees is good enough? :smile:

Ok ok, I won’t complain till then :slight_smile:

Hi @simonepigazzini,
we are testing friend trees support and we cannot reproduce failures when reading arrays from friend trees.
Can you copy-paste the problematic line and the output of TTree::Print for the corresponding branch?

This works for us:

   TFile f1(kFile1);
   TTree *t1 = static_cast<TTree *>(f1.Get("t"));
   t1->AddFriend("t3", kFile3);
   TDataFrame d(*t1);

   auto checkArr = [](std::array_view<float> av) {
      for (auto x : av)
         std::cout << x << " ";
      std::cout << "\n";
   };
   d.Foreach(checkArr, {"arr"});

Cheers,
Enrico

Hi Enrico,

I’m travelling now and I can’t post an example but the main difference is that I tried to access columns in the friend tree using the Define action.

cheers,

simone

Hi,
that was a good clue! Turns out we do not check friend trees for branch names used in expression strings.
E.g. this works

d.Define("x", [](double f) { return f*2; }, {"friendBranch"});

but this (currently) does not

d.Define("x", "friendBranch*2");

Work in progress!
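
To illustrate the kind of lookup involved, here is a rough sketch in plain C++, with no ROOT dependency. FindUsedColumns is a hypothetical helper made up for this post and is not how TDF is implemented: it scans an expression string for identifiers and keeps those that match a list of known column names. If that list is built from the main tree only, a name like friendBranch is never recognised as a column, which is the behaviour described above.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, NOT ROOT's implementation: return the known column
// names that appear as identifiers in a string expression, in order of
// first appearance and without duplicates.
std::vector<std::string> FindUsedColumns(const std::string &expr,
                                         const std::vector<std::string> &known)
{
   // Characters allowed inside a name; '.' lets names like "t2.b" through.
   auto isNameChar = [](unsigned char c) {
      return std::isalnum(c) || c == '_' || c == '.';
   };

   std::vector<std::string> used;
   std::size_t i = 0;
   while (i < expr.size()) {
      const unsigned char c = static_cast<unsigned char>(expr[i]);
      if (std::isalpha(c) || c == '_') {
         // Start of an identifier: read it to the end.
         std::size_t j = i;
         while (j < expr.size() && isNameChar(static_cast<unsigned char>(expr[j])))
            ++j;
         const std::string token = expr.substr(i, j - i);
         const bool isKnown = std::find(known.begin(), known.end(), token) != known.end();
         const bool isNew = std::find(used.begin(), used.end(), token) == used.end();
         if (isKnown && isNew)
            used.push_back(token);
         i = j;
      } else if (std::isdigit(c)) {
         // Skip numeric literals such as 2e5 so that "e5" is not read as a name.
         while (i < expr.size() && isNameChar(static_cast<unsigned char>(expr[i])))
            ++i;
      } else {
         ++i;
      }
   }
   return used;
}
```

With known = {"a"}, the expression "friendBranch*2" yields no columns. Once the friend's branch names are added to the list, the same expression resolves to friendBranch.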

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.

Commenting on this closed topic to let @simonepigazzini know that master now has support for TDataFrame+friend trees, both in single- and multi-thread execution. Please let us know if you encounter any problem :slight_smile:

Cheers,
Enrico
