RDataFrame snapshot automatic type deduction performance


ROOT Version: 6.16
Platform: Not Provided
Compiler: Not Provided


I am trying to take a few thousand snapshots of a dataframe with different filters. All snapshots write the same columns. From the following code I find that about 1 s per snapshot is spent on the automatic column type deduction. I would like to keep using the automatic deduction, but its cost is a bottleneck for me.

Is there a way to do this deduction only once and somehow apply it for all snapshots?

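#include <ROOT/RDataFrame.hxx>
#include "ROOT/RDF/RInterface.hxx"
#include <ctime>
#include <iostream>
#include <string>
#include <vector>

ROOT::RDF::RNode defines(ROOT::RDF::RNode node, int ncols){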
    if(ncols > 0){
        return defines(node.Define("x"+std::to_string(ncols), [ncols](){return ncols;}), ncols-1);
    }
    else{
        return node;
    }
}

int main(){
    ROOT::RDataFrame df_orig(10);
    auto df = defines(df_orig, 3);
    std::time_t start = std::time(0);

    ROOT::RDF::RSnapshotOptions opts;
    opts.fLazy = true;
    using SnapRet_t = ROOT::RDF::RResultPtr<ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager>>;
    std::vector<SnapRet_t> rets;
    
    start = std::time(0);
    for (auto i = 0; i < 5; ++i){
        rets.emplace_back(df.Snapshot<int,int,int>("t", "f" + std::to_string(i) + ".root", {"x1","x2","x3"}, opts));
    }
    std::cout << "time with template: " << std::time(0) - start << "s" << std::endl;

    start = std::time(0);    
    for (auto i = 0; i < 5; ++i){
        rets.emplace_back(df.Snapshot("t", "f" + std::to_string(i) + ".root", {"x1","x2","x3"}, opts));
    }
    std::cout << "time without template: " << std::time(0) - start << "s" << std::endl;

    return 0;
}
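
A side note on the measurement itself: std::time(0) has a resolution of one second, so short loops like the ones above can report 0s or 1s depending on where the second boundary falls. A minimal sketch of a finer-grained timer, using only the standard library, could look like the following. The helper name time_ms is hypothetical and not part of ROOT.

```cpp
#include <chrono>

// Hypothetical helper (not a ROOT API): run `work` once and return the
// elapsed wall-clock time in milliseconds, measured with a monotonic clock.
template <typename F>
double time_ms(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Each of the two Snapshot loops could be wrapped in a lambda and passed to such a helper, which makes sub-second differences between the templated and non-templated calls visible.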

Hi,

thanks for your report. We are aware of this performance degradation pattern and will implement a solution as soon as possible, though I am not sure we will make it in time for release 6.18.

Now, to address your concrete problem today: how many columns are you snapshotting? Is it an option to write the types explicitly, perhaps for part of the snapshots?

Cheers,
D

Hi,

thanks!

I am usually snapshotting 5-15 columns. All snapshots have the same columns, so I would only have to hardcode the column types once. The problem is that the number and types of the columns depend on a configuration file.

The config is read from a JSON file in Python. From it, some std::vector<std::string> are filled and passed to a C++ class. This class interacts with the dataframe and creates all the snapshots.

So one way might be to check the column types once in Python and then dynamically compile the C++ code with the right templated Snapshot call.

Cheers,
Christian

Hi Christian,

thanks for clarifying the context.
What about jitting, via gInterpreter->Declare, templated functions that propagate the types to the Snapshot call and take the list of columns as input?
Those would be jitted once and then used by your setup a few thousand times, thereby eliminating the problem.
I can help you through this if something is not clear.

Cheers,
Danilo
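
To make the suggestion a bit more concrete: the source of such a wrapper can be assembled as a plain string from the list of column types and then handed to gInterpreter->Declare once. The sketch below only builds that string and does not call ROOT. The helper names make_template_args and make_wrapper_source, and the func_name parameter, are hypothetical. The function signature inside the generated source is an assumption modelled on the Snapshot call from the first post.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: join type names into a template argument list,
// e.g. {"int", "float"} becomes "<int,float>".
std::string make_template_args(const std::vector<std::string>& types) {
    std::string out = "<";
    for (std::size_t i = 0; i < types.size(); ++i) {
        if (i != 0) out += ",";
        out += types[i];
    }
    out += ">";
    return out;
}

// Hypothetical helper: build the C++ source of a wrapper function that
// forwards to df.Snapshot with explicit template arguments. The resulting
// string is meant to be passed to gInterpreter->Declare (not done here).
std::string make_wrapper_source(const std::string& func_name,
                                const std::vector<std::string>& types) {
    return "ROOT::RDF::RResultPtr<ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager>> " +
           func_name +
           "(ROOT::RDF::RNode df, std::string treename, std::string fname, "
           "std::vector<std::string> columnnames, ROOT::RDF::RSnapshotOptions opts) {"
           " return df.Snapshot" + make_template_args(types) +
           "(treename, fname, columnnames, opts); }";
}
```

Passing the function name as a parameter is a design choice for the case where several column sets with different types are needed in one session: each set would then get its own uniquely named wrapper, since declaring the same function twice would be a redefinition.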

Hi Danilo,

Ok, I think that is exactly what I need.

So far I tried this:

#include <ROOT/RDataFrame.hxx>
#include "ROOT/RDF/RInterface.hxx"
#include <iostream>

ROOT::RDF::RNode defines(ROOT::RDF::RNode node, int ncols){
    if(ncols > 0){
        return defines(node.Define("x"+std::to_string(ncols), [ncols](){return ncols;}), ncols-1);
    }
    else{
        return node;
    }
}

int snapshotperf(){
    ROOT::RDataFrame df_orig(10);
    auto df = defines(df_orig, 3);
    std::time_t start = std::time(0);

    ROOT::RDF::RSnapshotOptions opts;
    opts.fLazy = true;
    using SnapRet_t = ROOT::RDF::RResultPtr<ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager>>;
    std::vector<SnapRet_t> rets;

    std::vector<std::string> columnnames = {"x1", "x2", "x3"};
    std::vector<std::string> columntypes = {"int", "int", "int"};
    std::string template_expr("<");
    for(int i = 0; i < columntypes.size(); i++){
        template_expr+=columntypes[i];
        if(i!= columntypes.size()-1)
            template_expr+=",";
    }
    template_expr+=">";


    std::string declare_expr(
        "ROOT::RDF::RResultPtr<ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager>> make_snap(ROOT::RDF::RNode df, std::string treename, std::string fname, std::vector<std::string> columnnames, ROOT::RDF::RSnapshotOptions opts){"
        "return df.Snapshot"+template_expr+"(treename, fname, columnnames, opts);"
        "}");

    gInterpreter->Declare(declare_expr.c_str());

    TInterpreterValue *tiv = gInterpreter->CreateTemporary();
    std::string eval_str(
            "[](ROOT::RDF::RNode df, std::string treename, std::string fname, std::vector<std::string> columnnames, ROOT::RDF::RSnapshotOptions opts)"
            " {return make_snap(df, treename, fname, columnnames, opts);};"
        );
    gInterpreter->Evaluate(eval_str.c_str(), *tiv);

    using functype = std::function<SnapRet_t(ROOT::RDF::RNode,std::string,std::string,std::vector<std::string>, ROOT::RDF::RSnapshotOptions)>;
    functype make_snap = *(functype*)tiv->GetAsPointer();
    for (auto i = 0; i < 5; ++i){
        SnapRet_t res = make_snap(df, "t", "f" + std::to_string(i) + ".root", columnnames, opts);
        rets.emplace_back(res);
    }

    return df.Count().GetValue();
}

So far this gives me a segmentation violation. Do you know what I did wrong?

Cheers,
Christian

 *** Break *** segmentation violation
[/usr/lib/system/libsystem_platform.dylib] _sigtramp (no debug info)
[<unknown binary>] (no debug info)
[<unknown binary>] (no debug info)
[<unknown binary>] (no debug info)
[<unknown binary>] (no debug info)
[/Users/Christian/work/root/root_v6_16/lib/libCling.so] cling::Interpreter::RunFunction(clang::FunctionDecl const*, cling::Value*) (no debug info)
[/Users/Christian/work/root/root_v6_16/lib/libCling.so] cling::Interpreter::EvaluateInternal(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, cling::CompilationOptions, cling::Value*, cling::Transaction**, unsigned long) (no debug info)
[/Users/Christian/work/root/root_v6_16/lib/libCling.so] cling::MetaSema::actOnxCommand(llvm::StringRef, llvm::StringRef, cling::Value*) (no debug info)
[/Users/Christian/work/root/root_v6_16/lib/libCling.so] cling::MetaParser::isXCommand(cling::MetaSema::ActionResult&, cling::Value*) (no debug info)
[/Users/Christian/work/root/root_v6_16/lib/libCling.so] cling::MetaParser::isCommand(cling::MetaSema::ActionResult&, cling::Value*) (no debug info)
[/Users/Christian/work/root/root_v6_16/lib/libCling.so] cling::MetaProcessor::process(llvm::StringRef, cling::Interpreter::CompilationResult&, cling::Value*, bool) (no debug info)
[/Users/Christian/work/root/root_v6_16/lib/libCling.so] HandleInterpreterException(cling::MetaProcessor*, char const*, cling::Interpreter::CompilationResult&, cling::Value*) (no debug info)
[/Users/Christian/work/root/root_v6_16/lib/libCling.so] TCling::ProcessLine(char const*, TInterpreter::EErrorCode*) (no debug info)
[/Users/Christian/work/root/root_v6_16/lib/libCling.so] TCling::ProcessLineSynch(char const*, TInterpreter::EErrorCode*) (no debug info)
[/Users/Christian/work/root/root_v6_16/lib/libCore.6.16.so] TApplication::ExecuteFile(char const*, int*, bool) (no debug info)
[/Users/Christian/work/root/root_v6_16/lib/libRint.6.16.so] TRint::ProcessLineNr(char const*, char const*, int*) (no debug info)
[/Users/Christian/work/root/root_v6_16/lib/libRint.6.16.so] TRint::Run(bool) (no debug info)
[/Users/Christian/work/root/root_v6_16/bin/root.exe] main (no debug info)
[/usr/lib/system/libdyld.dylib] start (no debug info)
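
The thread does not state the cause of this crash. One plausible reading, offered here as an assumption, is that the expression passed to gInterpreter->Evaluate is a lambda closure rather than a std::function, so reinterpreting the returned pointer as a functype* is undefined behavior. The ROOT-free sketch below illustrates the distinction with stand-in types: a declared free function can be handed around as a plain function pointer, and a std::function has to be constructed from it rather than obtained by casting. All names in it (SnapFn, SnapStdFn, name_length, get_function_pointer, wrap) are hypothetical.

```cpp
#include <functional>
#include <string>

// Stand-in signatures for illustration only (not the real Snapshot types).
using SnapFn = int (*)(const std::string&);               // plain function pointer
using SnapStdFn = std::function<int(const std::string&)>; // type-erased wrapper

// A free function, analogous to one made known to the interpreter
// through a declaration.
int name_length(const std::string& s) { return static_cast<int>(s.size()); }

// Well defined: take the address of the function as a function pointer.
SnapFn get_function_pointer() { return &name_length; }

// Well defined: construct a std::function from the function pointer.
// Casting some unrelated object pointer to SnapStdFn* and dereferencing
// it would not be, because no std::function object lives at that address.
SnapStdFn wrap(SnapFn f) { return SnapStdFn(f); }
```

Under this reading, retrieving the declared function as a function pointer avoids the problem, and a std::function can still be built from that pointer afterwards if one is needed.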

Ok I found a working solution.
Thanks a lot for pointing me to this gInterpreter->Declare.

#include <ROOT/RDataFrame.hxx>
#include "ROOT/RDF/RInterface.hxx"
#include <TInterpreter.h>
#include <ctime>
#include <iostream>
#include <string>
#include <vector>

ROOT::RDF::RNode defines(ROOT::RDF::RNode node, int ncols){
    if(ncols > 0){
        return defines(node.Define("x"+std::to_string(ncols), [ncols](){return ncols;}), ncols-1);
    }
    else{
        return node;
    }
}

int snapshotperf(){
    ROOT::RDataFrame df_orig(10);
    auto df = defines(df_orig, 3);
    std::time_t start = std::time(0);

    ROOT::RDF::RSnapshotOptions opts;
    opts.fLazy = true;
    using SnapRet_t = ROOT::RDF::RResultPtr<ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager>>;
    std::vector<SnapRet_t> rets;

    std::vector<std::string> columnnames = {"x1", "x2", "x3"};
    std::vector<std::string> columntypes = {"int", "int", "int"};
    std::string template_expr("<");
    for(int i = 0; i < columntypes.size(); i++){
        template_expr+=columntypes[i];
        if(i!= columntypes.size()-1)
            template_expr+=",";
    }
    template_expr+=">";

    start = std::time(0);
    std::string declare_expr(
        "ROOT::RDF::RResultPtr<ROOT::RDF::RInterface<ROOT::Detail::RDF::RLoopManager>> make_snap"
        "(ROOT::RDF::RNode df, std::string treename, std::string fname, std::vector<std::string> columnnames, ROOT::RDF::RSnapshotOptions opts){"
        "return df.Snapshot"+template_expr+"(treename, fname, columnnames, opts);"
        "}");

    gInterpreter->Declare(declare_expr.c_str());
    std::cout << "time to declare " << std::time(0)-start << std::endl;

    start = std::time(0);
    auto make_snap = (SnapRet_t (*)(ROOT::RDF::RNode,std::string,std::string,std::vector<std::string>, ROOT::RDF::RSnapshotOptions)) gInterpreter->ProcessLine("make_snap");
    std::cout << "time to get function " << std::time(0)-start << std::endl;

    start = std::time(0);
    for (auto i = 0; i < 5; ++i){
        SnapRet_t res = make_snap(df, "t", "f" + std::to_string(i) + ".root", columnnames, opts);
        rets.emplace_back(res);
    }
    std::cout << "time to create snapshots " << std::time(0)-start << std::endl;

    return df.Count().GetValue();
}

Great!
Thanks for sharing it.

Cheers,
D


Dear @cwiel ,

I am taking the liberty of bringing up this old topic again because we have just finished an improvement to RDataFrame Snapshot: it is no longer necessary to specify any template arguments at all, and no JIT compilation is required either. In practice, you get the benefit of a simpler API (no need to care about template arguments) together with much faster runtime performance.

The change has just been merged to the development branch of ROOT, so it is going to be available in the next ROOT version, 6.38, scheduled for the end of this year. In the meantime, if you have access to an LCG release via CVMFS, for example using lxplus, you can already see this in action. You can source the environment via e.g.

source /cvmfs/sft.cern.ch/lcg/views/dev3/latest/x86_64-el9-gcc13-opt/setup.sh

Cheers,
Vincenzo