TDataFrame Long64_t values

Hi there,

I have been playing with the new TDataFrame. It works well with double variables. How do I use it with Long64_t branches? Task: find minimum and maximum value.

The following program shows the problem:

#include <ROOT/TDataFrame.hxx>
#include <TTree.h>
#include <iostream>
#include <limits>
#include <memory>
#include <type_traits>

using namespace std;

void tdf_test() {
    auto t = make_unique<TTree>("t", "t");
    Long64_t l;
    auto lbr = t->Branch("l", &l);
    for (Long64_t l_test :
         {numeric_limits<Long64_t>::max(), numeric_limits<Long64_t>::max() - 1}) {
        l = l_test;
        t->Fill();
    }
    ROOT::Experimental::TDataFrame df(*t, {"l"});
    auto lo = df.Min();
    auto hi = df.Max();
    cout << "min " << *lo << "\nmax " << *hi << '\n'
         << boolalpha << "Min and max are the same: " << (*lo == *hi) << '\n'
         << "result is Long64_t: "
         << (is_same<remove_reference_t<decltype(*lo)>, Long64_t>::value) << '\n'
         << "result is Double_t: "
         << (is_same<remove_reference_t<decltype(*lo)>, Double_t>::value) << '\n';
}

Hi,
thanks for trying out TDF, we really need more user feedback :slight_smile:

The documentation of Min and Max is lacking – I thought I had updated it but evidently I did not:
Min, Max and Sum return a proxy to a T, where the type T is:

  • double when no template parameter is specified (like in your case)
  • the type of the column when it is specified as a template parameter (e.g. Min<Long64_t>())
  • the value_type of an STL container if the template parameter is specified and is a container type (in this case Min, Max and Sum work on each element of the container for each event)

So you should write Min<Long64_t>() and Max<Long64_t>() to get the correct type out. Otherwise we have to infer the type from the TTree at runtime, which leaves us no choice but to return a sensible default type at compile time, i.e. double.
If this does not answer your question, what version of ROOT are you on and what is the output?
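
To illustrate why the double default matters for this particular test: the two values filled into the tree differ by 1, but both are larger than what a double can represent exactly, so they presumably collapse to the same number once read back as double. The sketch below shows only that conversion effect in plain C++, without ROOT. `Long64`, `as_double`, `collide_as_double` and `check_collision` are hypothetical names introduced here, and `Long64` is assumed to be a 64-bit signed integer like `Long64_t`.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Stand-in for ROOT's Long64_t (assumption: a 64-bit signed integer),
// so that this sketch compiles without ROOT.
using Long64 = std::int64_t;

// Mimics reading an integer column back as a double.
inline double as_double(Long64 v) { return static_cast<double>(v); }

// True if two distinct integers become indistinguishable as doubles.
inline bool collide_as_double(Long64 a, Long64 b) {
    return a != b && as_double(a) == as_double(b);
}

// The two values from the test program: max and max - 1.
inline void check_collision() {
    const Long64 hi = std::numeric_limits<Long64>::max();
    assert(collide_as_double(hi, hi - 1));  // distinct as integers, equal as doubles
    assert(!collide_as_double(1, 2));       // small integers survive the conversion
}
```

Keeping the integer type, as Min<Long64_t>() and Max<Long64_t>() do, avoids the conversion altogether.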

Cheers,
Enrico


Thanks, that works!

I suspected something like that but didn’t know where to find Min and Max. I saw the structs Max and Min in TDFUtils.hxx, where they were just empty. First I looked at TDFInterface.hxx, but that was also not the correct place, so I thought it better to ask here. (I should have looked at IntelliSense, but then I would need to know what TInferType is.)

ROOT version: master from some days (probably weeks?) ago + some custom TMVA patches :slight_smile:

Maybe a follow-up:
when calling *df.Min(), *df.Max(), *df.Mean(), does ROOT loop 3 times over the tree? Can I loop just once?

Hi,
the information I gave you should have been written here, in the Min, Max documentation. I will fix this as soon as possible.

If you write it like that, it’s 3 event loops. If you write it like

auto min = df.Min();
auto max = df.Max();
auto mean = df.Mean();
*min, *max, *mean;

it’s one event loop. In general you should register all the computations you want to do before accessing their results. The section “Executing multiple actions in the same event loop” of the TDF user guide should explain this clearly (at least in our intentions :slight_smile: ).
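
As a rough illustration of the "book first, read later" rule, here is a toy model in plain C++. It is not ROOT's implementation: `ToyFrame`, `Result`, `State` and `loops()` are hypothetical names made up for this sketch. Booking an action only records it; the loop over the data runs when a result is dereferenced, and it fills every action booked so far.

```cpp
#include <cassert>
#include <functional>
#include <limits>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical toy model of lazy actions; NOT how TDataFrame is implemented.
class ToyFrame {
    struct State {
        std::vector<double> data;
        std::vector<std::function<void(double)>> pending;  // booked, not yet run
        int loops = 0;                                      // number of passes over data

        void run() {
            if (pending.empty()) return;  // everything booked is already computed
            ++loops;
            for (double x : data)
                for (auto &action : pending) action(x);
            pending.clear();
        }
    };
    std::shared_ptr<State> state_;

public:
    // Handle to a value that is computed on first access.
    class Result {
        std::shared_ptr<State> state_;
        std::shared_ptr<double> value_;
    public:
        Result(std::shared_ptr<State> s, std::shared_ptr<double> v)
            : state_(std::move(s)), value_(std::move(v)) {}
        double operator*() const { state_->run(); return *value_; }
    };

    explicit ToyFrame(std::vector<double> data) : state_(std::make_shared<State>()) {
        state_->data = std::move(data);
    }

    Result Min() {
        auto v = std::make_shared<double>(std::numeric_limits<double>::infinity());
        state_->pending.push_back([v](double x) { if (x < *v) *v = x; });
        return Result(state_, v);
    }
    Result Max() {
        auto v = std::make_shared<double>(-std::numeric_limits<double>::infinity());
        state_->pending.push_back([v](double x) { if (x > *v) *v = x; });
        return Result(state_, v);
    }
    Result Mean() {
        auto v = std::make_shared<double>(0.0);
        state_->pending.push_back([v, sum = 0.0, n = 0](double x) mutable {
            sum += x; ++n; *v = sum / n;
        });
        return Result(state_, v);
    }
    int loops() const { return state_->loops; }
};

// Usage: book all three actions, then read the results.
inline void usage() {
    ToyFrame df({3.0, 1.0, 2.0});
    auto lo = df.Min(); auto hi = df.Max(); auto mean = df.Mean();
    assert(*lo == 1.0 && *hi == 3.0 && *mean == 2.0);
    assert(df.loops() == 1);  // a single pass served all three results
}
```

In this toy, writing `*df.Min()`, `*df.Max()`, `*df.Mean()` back to back makes three passes, because each result is read before the next action is booked.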

Cheers,
Enrico

OK, yes, I remember reading that. My main problem is always finding the correct documentation page. When googling for ROOT-specific things you sometimes still end up in the ROOT 5.20x or even the ancient html302 documentation. That’s a bit annoying. (Of course there is no TDataFrame in these versions, but still…)

And for TDF, all the templates make it a bit harder to understand what is going on. Especially if you read

template<typename Proxied>
template<typename T = TDFDetail::TInferType>
TResultProxy<TDFInternal::MaxReturnType_t<T>> ROOT::Experimental::TDF::TInterface<Proxied>::Max(std::string_view columnName = "")

An example using the template parameters would make things simpler, especially because not all physicists (and/or other ROOT users) are very familiar with templated code. In particular: what is “Proxied”, what is “T”, where does one put the template parameter, and so on. (For me, it was just a matter of not finding the documentation page and not looking properly at the help in my editor.)
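
For readers puzzled by the two `template<...>` lines, a stripped-down mock of the pattern might help. Everything in the `toy` namespace below is hypothetical and only mirrors the shape of the signature quoted above. It is not ROOT's code, and it ignores the container case. The first `template` line belongs to the class and is fixed by the object you already have; the second belongs to the method, and that is the one written in angle brackets at the call site.

```cpp
#include <string>
#include <type_traits>

namespace toy {

// Tag type meaning "the caller did not specify a type".
struct InferType {};

// Maps the tag to double and any other T to itself.
template <typename T>
using MaxReturnType_t =
    std::conditional_t<std::is_same<T, InferType>::value, double, T>;

template <typename Proxied>  // class-level parameter: never written when calling Max
class Interface {
public:
    template <typename T = InferType>  // method-level parameter: goes after the method name
    MaxReturnType_t<T> Max(const std::string & /*columnName*/ = "") const {
        return MaxReturnType_t<T>{};   // dummy value; only the return type matters here
    }
};

}  // namespace toy

// Compile-time check of the two call forms.
inline void signature_demo() {
    toy::Interface<int> df;  // int is an arbitrary placeholder for Proxied
    static_assert(std::is_same<decltype(df.Max()), double>::value,
                  "no template argument: falls back to double");
    static_assert(std::is_same<decltype(df.Max<long long>("l")), long long>::value,
                  "explicit template argument: that type is returned");
}
```

So in this mock, `df.Max()` yields a double while `df.Max<long long>("l")` yields a long long, which matches the rule described earlier in the thread for Min, Max and Sum.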

Yes I agree it’s not nice to have such complicated signatures in the docs. Not much I can do about that I’m afraid, but as you suggest some example snippets in the docs might go a long way.
