Random number generation with RDataFrame

Hi all,

I have observed some odd behavior when using a random number generator with RDataFrame that produces strange artifacts in my results. Here is a simplified example:

TRandom3 *rndm;

double GetRNG() { return rndm→Gaus(); }

void func(int n) {

    rndm = new TRandom3(0);

    ROOT::EnableImplicitMT();

    auto df = ROOT::RDataFrame(n).Define(“x”, GetRNG);

    auto h = df.Histo1D({“h”, “h”, 1000, -10, 10}, {“x”});

    h→DrawClone();

}

When MT is enabled, there are spikes in the resulting distribution that are not present in single-threaded mode. I suspect there’s some sort of race condition related issue from each thread trying to access the same random number generator, but am not sure how to remedy this. Is there a way to give each thread in MT mode its own random number generator to avoid this problem?

Thank you!

1 Like

TRandom3 is probably not thread-safe.

See https://stackoverflow.com/questions/8813592/c11-thread-safety-of-random-number-generators

A workaround would do is define a static thread_local TRandom3 object in each thread

Are there any examples of this with RDataFrame? I saw this thread, where it was recommended to use DefineSlot to access an array of random number generators, but I’m still fairly new to C++ and don’t understand how to implement that.

@FoxWise could you share a full example?

Related: Problem with RDataFrame multithreading - #3 by ajaydeo

Maybe @vpadulan can help

Hi @nicconflo, @ferhue,

Sure, here is the full reproducible with the example how to define several random engines :slight_smile:

Reproducible

Run it as root test.cpp.
I have tested it with ROOT 6.38

//test.cpp

struct RandomEngines{
    RandomEngines(){
        //Assuming you have <= 64 cores
        for (int i=0; i<64; i++) engines.push_back( TRandom3( rd() ) );
    }
    std::vector< TRandom3 > engines;

    // for random seeding of several random engines
    std::random_device rd;
};

RandomEngines engines;

double GetRNG(unsigned int slot) { return engines.engines[slot].Gaus(); }
double GetRNG_single_engine() { return engines.engines[0].Gaus(); }


void test(){
    auto* c1 = new TCanvas("c1", "c1", 1500, 500);
    c1->Divide(3, 1);
    
    ROOT::DisableImplicitMT();
    auto df1 = ROOT::RDataFrame(10000000).Define("x", GetRNG_single_engine);
    auto h1 = df1.Histo1D({"h1", "single core", 1000, -4, 4}, {"x"});    
    c1->cd(1);
    h1->DrawClone();

    
    ROOT::EnableImplicitMT(16);
    auto df3 = ROOT::RDataFrame(10000000).Define("x", GetRNG_single_engine);
    auto df4 = ROOT::RDataFrame(10000000).Define("x", GetRNG, {"rdfslot_"});
    
    auto h3 = df3.Histo1D({"h3", "single random engine", 1000, -4, 4}, {"x"});
    auto h4 = df4.Histo1D({"h4", "N random engines", 1000, -4, 4}, {"x"});
    
    c1->cd(2);
    h3->DrawClone();
    c1->cd(3);
    h4->DrawClone();
    std::cout<<std::fixed<<std::setprecision(3)<<"Perfect                    : "<<"0.000"<<" +- "<<"1.000"<<std::endl;
    std::cout<<std::fixed<<std::setprecision(3)<<"Single thread              : "<<h1->GetMean()<<" +- "<<h1->GetStdDev()<<std::endl;
    std::cout<<std::fixed<<std::setprecision(3)<<"Single random engine   (MT): "<<h3->GetMean()<<" +- "<<h3->GetStdDev()<<std::endl;
    std::cout<<std::fixed<<std::setprecision(3)<<"Several random engines (MT): "<<h4->GetMean()<<" +- "<<h4->GetStdDev()<<std::endl;   
}

Output

Warning in <ROOT_TImplicitMT_DisableImplicitMT>: Implicit multi-threading is already disabled
Perfect                    : 0.000 +- 1.000
Single thread              : -0.000 +- 0.999
Single random engine   (MT): 0.001 +- 0.992
Several random engines (MT): -0.000 +- 0.999

You can see the weird spikes, but also, that the final STD is significantly off the value of the Gaus you’d expect.

You should also note that the documentation for TRandom3 actually suggests to use std::mt19937 as the random engine, because (some smart reasons).
Changing to std::mt19937 is rather easy. See below :slight_smile:

Reproducible with std::mt19937

//test.cpp

struct RandomEngines{
    RandomEngines(){
        //Assuming you have <= 64 cores
        for (int i=0; i<64; i++) engines.push_back( std::mt19937( rd() ) );
    }
    std::vector< std::mt19937 > engines;

    // for random seeding of several random engines
    std::random_device rd;
};

RandomEngines engines;

std::normal_distribution<double> gaus(0., 1.);
double GetRNG(unsigned int slot) { return gaus(engines.engines[slot]); }
double GetRNG_single_engine() { return gaus(engines.engines[0]); }

void test(){
    auto* c1 = new TCanvas("c1", "c1", 1500, 500);
    c1->Divide(3, 1);
    
    ROOT::DisableImplicitMT();
    auto df1 = ROOT::RDataFrame(10000000).Define("x", GetRNG_single_engine);
    auto h1 = df1.Histo1D({"h1", "single core", 1000, -4, 4}, {"x"});    
    c1->cd(1);
    h1->DrawClone();

    
    ROOT::EnableImplicitMT(16);
    auto df3 = ROOT::RDataFrame(10000000).Define("x", GetRNG_single_engine);
    auto df4 = ROOT::RDataFrame(10000000).Define("x", GetRNG, {"rdfslot_"});
    
    auto h3 = df3.Histo1D({"h3", "single random engine", 1000, -4, 4}, {"x"});
    auto h4 = df4.Histo1D({"h4", "N random engines", 1000, -4, 4}, {"x"});
    
    c1->cd(2);
    h3->DrawClone();
    c1->cd(3);
    h4->DrawClone();
    std::cout<<std::fixed<<std::setprecision(3)<<"Perfect                    : "<<"0.000"<<" +- "<<"1.000"<<std::endl;
    std::cout<<std::fixed<<std::setprecision(3)<<"Single thread              : "<<h1->GetMean()<<" +- "<<h1->GetStdDev()<<std::endl;
    std::cout<<std::fixed<<std::setprecision(3)<<"Single random engine   (MT): "<<h3->GetMean()<<" +- "<<h3->GetStdDev()<<std::endl;
    std::cout<<std::fixed<<std::setprecision(3)<<"Several random engines (MT): "<<h4->GetMean()<<" +- "<<h4->GetStdDev()<<std::endl;   
}

Output of the reproducible with std::mt19937

Warning in <ROOT_TImplicitMT_DisableImplicitMT>: Implicit multi-threading is already disabled
Perfect                    : 0.000 +- 1.000
Single thread              : 0.000 +- 1.000
Single random engine   (MT): 0.001 +- 1.014
Several random engines (MT): -0.000 +- 0.999

What I find notably interesting, is that you don’t see any spikes with std::mt19937 even if using a single engine with MT enabled! So these spikes is somehow connected to using TRandom3 exactly.

Nevertheless, as you can see by standard deviations, it still produces wrong results! So you indeed need several random engine, for each thread.

I hope that helps :slight_smile:

1 Like

Thanks for the wonderful analysis (worth becoming an official ROOT tutorial if you want to submit a PR)

An additional (simpler solution) might be:

double GetRNG_threadlocal_engine() {
   static thread_local TRandom3 r3(std::hash<std::thread::id>{}(std::this_thread::get_id())); /* you can also use rand() as seed instead*/
   return r3.Gaus();
}

see modified script attached:
test.cpp (2.0 KB)

This way, you don’t need to predeclare an struct/array of engines explicitly.

(DefineSlot instead of Define might be also useful but I am not sure how to use it.)

I think DefineSlot() is just a simple wrapper, because from the API documentation of DefineSlot():

int function(unsigned int, double, double);
// these two calls are equivalent
df.Define("x", function, {"rdfslot_", "column1", "column2"})
df.DefineSlot("x", function, {"column1", "column2"})

So my example also works with DefineSlot by simply changing:

    auto df = ROOT::RDataFrame(10000000).Define("x", GetRNG, {"rdfslot_"});
    // should become
    auto df = ROOT::RDataFrame(10000000).DefineSlot("x", GetRNG);

I think your solution would need to slightly change to have unsigned int as the first input arg in order to use DefineSlot()

I can open a PR for the example :slight_smile:

1 Like

Good point. In that case my solution would become with DefineSlot:

double GetRNG_threadlocal_engine(unsigned int slot) {
   static thread_local TRandom3 r3(slot); /* you can also use rand() as seed instead*/
   return r3.Gaus();
}
1 Like