IMT with RooDataSet in RDataFrame::Foreach

Hi all,
I am trying to teach my RDataFrame scripts to create RooDataSets, use TMVA::Reader and potentially other things.
In a first prototype, in which I obviously did not think about thread safety and potentially other things that could break, the function

auto save_in_ds = [&arglist_vars,&ds] (const std::vector<double>& rdf_vars) -> void {
      auto ctr = 0u;
      auto vit = arglist_vars.createIterator();
      while(auto rrv = static_cast<RooRealVar*>(vit->Next()))
        rrv->setVal(rdf_vars[ctr++]);
      ds.add(arglist_vars);
    };

is used in a Foreach call to my RDataFrame object:

df.Foreach(save_in_ds,{"rdf_vars_for_ws"});

Here, arglist_vars is a RooArgList and the vector rdf_vars_for_ws is created by df.Define together with this RooArgList earlier.

I tested this function with 3 different IMT settings: i) off ii) on, 1 thread iii) on, 4 threads, each with a different outcome. Setting i) worked, setting ii) broke with double free or corruption at a relatively late stage (30-60 % in the loop) and setting iii) breaks almost immediately. From all I can judge, the backtrace points towards a MT issue. As I said, I didn’t think about thread safety at all so far, but I believe that that’s the issue here. If that’s true, I would like to ask you if RooDataSets and other objects can be made thread safe easily?

If you want more info like reproducers or debugging output, please let me know.

Cheers,
Marian


ROOT Version: 6.17.01
Platform: x86_64-slc6-gcc8-opt
Compiler: g++ 8.2.0


Hi Marian,
your save_in_ds lambda is not thread-safe: multiple threads will be writing to the ds variable at the same time. You can either use a different RooDataSet per thread or put a mutex around the unsafe calls.

I’m not sure whether createIterator() is also thread-unsafe, but the stacktrace of the crash will tell you :slight_smile:

Cheers,
Enrico

Hi Enrico,
Using a mutex did the trick, thanks a lot! I didn’t expect it to be that easy :slight_smile:
The performance is currently similar to the single-threaded, so i will have a look if ForeachSlot can give a boost.

Thanks,
Marian

If somebody else runs into a similar problem, the solution I found looks like this

    const auto dsname = configtree.get("dsname","ds");
    RooDataSet ds(dsname.data(),dsname.data(),arglist_vars);
    std::mutex dsMutex;

    auto save_in_ds = [&arglist_vars,&ds,&dsMutex] (const std::vector<double>& rdf_vars) -> void {
      std::lock_guard<std::mutex> l(dsMutex);
      auto ctr = 0u;
      auto vit = arglist_vars.createIterator();
      while(auto rrv = static_cast<RooRealVar*>(vit->Next()))
        rrv->setVal(rdf_vars[ctr++]);
      ds.add(arglist_vars);
    };
    df.Foreach(save_in_ds,{"rdf_vars_for_ws"});

Hi Marian,
with that mutex there, calls to save_in_ds will run sequentially. If your program spends most of its time in there, you will get no speed-up from multi-threading (actually you could run a bit slower than with 1 thread).
If save_in_ds is a fraction of your runtime, and the rest can be run in parallel, Amdhal’s law applies.

Cheers,
Enrico

Hi Enrico,
writing the code with ForeachSlot seems to be harder than i thought. I now have the following code:

    std::mutex dsMutex;
    auto save_in_ds = [&alists,&dsets,&dsMutex] (unsigned int thread, const std::vector<double>& rdf_vars) -> void {
      std::lock_guard<std::mutex> l(dsMutex);
      auto ctr = 0u;
      auto vit = alists[thread].createIterator();
      while(auto rrv = static_cast<RooRealVar*>(vit->Next()))
        rrv->setVal(rdf_vars[ctr++]);
      dsets[thread].add(alists[thread]);
    };
    df.ForeachSlot(save_in_ds,{"rdf_vars_for_ws"});

Where alists is a nthreads-sized vector of RooArgList holding unique_ptr to RooRealVars (i tested that each of them points to a different address). When i set up the RooArgLists, I also fill a vector of RooDataSets (dsets).
In principle such a setup wouln’t need a mutex, correct? With or without it, the code segfaults and the error points to the last line of the lambda.
Is there something obvious that i’m doing wrong here?
Cheers,
Marian

Hi,
I guess a debugger will point you right at the problem. If it’s the last line of the lambda, my best guess would be that thread is higher than the dimension of dsets (out-of-bound access).

Cheers,
Enrico

Hi Enrico,
You were right again :slight_smile:
I did not realize that thread in my function goes up to ROOT::GetImplicitMTPoolSize(), which is 10 in my case. I thought that it is only 4 if i set

ROOT::EnableImplicitMT(4); 

Thanks again and sorry for the noise,
Marian

1 Like

In master, and from the next ROOT version, the always-correct number is returned by RDataFrame::GetNSlots

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.