GetPtr works differently when it is returned from a function (maybe a bug)

I am using RDataFrame and am confused by the result of GetPtr() in the following snippet:

{
    ROOT::RDataFrame d(2000000);
    auto dd = d.Define("x", [](){ return gRandom->Uniform(-5.,  5.); });

    auto getTH1DFunc = [&dd]() {
        return
            dd.Histo1D({"", "", 100, -5, 5}, {"x"});
    };

    auto getTH1DPtrFunc = [&dd]() {
        return
            dd.Histo1D({"", "", 100, -5, 5}, {"x"}).GetPtr();
    };

    auto getTH1DPtrFunc2 = [&dd]() -> TH1D* {
        return
            dynamic_cast<TH1D *>(dd.Histo1D({"", "", 100, -5, 5}, {"x"}).GetPtr());
    };

    cout << "Histo1D direct Print:           ";
    getTH1DFunc()->Print();
    cout << "Histo1D.GetPtr() direct Print:  ";
    getTH1DFunc().GetPtr()->Print();
    cout << "Func returned Histo1D:          ";
    getTH1DPtrFunc()->Print();
    cout << "Func returned Histo1D and cast: ";
    getTH1DPtrFunc2()->Print();
}

The above snippet prints:

Histo1D direct Print:           TH1.Print Name  = , Entries= 2000000, Total sum= 2e+06
Histo1D.GetPtr() direct Print:  TH1.Print Name  = , Entries= 2000000, Total sum= 2e+06
Func returned Histo1D:          OBJ: TObject    TObject Basic ROOT object
Func returned Histo1D and cast: OBJ: TObject    TObject Basic ROOT object

Why does the pointer of GetPtr() lose the class information when it is returned by a function (lambda and normal function give the same results)?

Is it a bug or a mistake in my cpp code?

My ROOT version is:

ROOT Version: 6.26/04
Built for linuxx8664gcc on Jun 07 2022, 16:01:16
From tags/v6-26-04@v6-26-04

Thank you very much!

I also checked the results of ClassName and Class_Name:

    cout << "Histo1D.ClassName:                 " << getTH1DFunc()->ClassName() << endl;
    cout << "Histo1D.Class_Name:                " << getTH1DFunc()->Class_Name() << endl;

    cout << "Func returned Histo1D.ClassName:   " << getTH1DPtrFunc()->ClassName() << endl;
    cout << "Func returned Histo1D.Class_Name:  " << getTH1DPtrFunc()->Class_Name() << endl;

The results are

Histo1D.ClassName:                 TH1D
Histo1D.Class_Name:                TH1D
Func returned Histo1D.ClassName:   TObject
Func returned Histo1D.Class_Name:  TH1D

This also confuses me in light of the discussion about ClassName and Class_Name at https://root-forum.cern.ch/t/classname-vs-class-name/51479

Hi @qiutum ,

Classic C++ lifetime shenanigans!

Here you are returning a non-owning pointer to the underlying histogram. The RResultPtr<TH1D> returned by the Histo1D call is the owner, and it goes out of scope at the end of the function, so the underlying histogram object gets destructed right there. The raw pointer you get back is dangling, and dereferencing it is undefined behavior.

Simplest solution: directly return and pass around RResultPtr<T>s instead of T*s:

auto getTH1DPtrFunc = [&dd] {
    return dd.Histo1D({"", "", 100, -5, 5}, {"x"});
};

This has another huge advantage: if you call GetPtr right after each Histo1D (or other action), RDF is forced to run one event loop per action to produce the result right there and then, in order to return a valid pointer from GetPtr. If you return RResultPtrs instead, that does not trigger the event loop: it is triggered only upon first access to any of the results, and all of them are produced in the same event loop. That should provide a huge speed-up.

Cheers,
Enrico
