Sample from discrete distribution

ccorti · March 30, 2020, 9:21am

Hi,

I have a complicated discrete PDF from which I want to draw random numbers.
I tried to use the usual approach of defining a TF1 and then either TH1::FillRandom or TF1::GetRandom, but I don’t get the correct result.
Here’s the PDF class:

struct WattsStrogatz {
    double operator()(double *x, double *par) {
        int k = x[0];

        int N = par[0];
        int K = par[1];
        double p = par[2];

        double P = 0;
        for (int n = 0; n <= min(k-K, K); ++n) {
            P += ROOT::Math::binomial_pdf(n, 1-p, K) * 
                 ROOT::Math::poisson_pdf(k-K-n, p*K);
        }

        return P;
    }
};

and here’s a minimal working macro:

{
    int N = 1e3, K = 3;
    double p = 0.99;

    TF1 *f = new TF1("f", WattsStrogatz(), 0, 18, 3);
    f->SetParameters(N, K, p);
    f->SetNpx(3600);

    TGraph *g = new TGraph(18-K);
    g->SetMarkerStyle(kFullCircle);
    g->SetMarkerSize(1.4);
    g->SetLineStyle(2);
    g->SetLineWidth(2);
    for (int i = 0; i < g->GetN(); ++i) {
        double k = i + K;
        g->SetPoint(i, k, f->Eval(k));
    }

    TH1F *h = new TH1F("h", "", 18, -0.5, 17.5);
    h->SetLineColor(kBlue);
    h->SetLineWidth(2);
    h->FillRandom("f", 1e7);

    TH1F *h2 = new TH1F("h2", "", 18, -0.5, 17.5);
    h2->SetLineColor(kMagenta);
    h2->SetLineWidth(2);
    h2->SetLineStyle(7);
    for (int i = 0; i < 1e7; ++i) {
        h2->Fill(f->GetRandom());
    }

    g->Draw();
    gPad->SetLogy();
    gPad->SetGrid();
    h->DrawNormalized("hist same");
    h2->DrawNormalized("hist same");
    f->Draw("same");
}

This is the result:

Note that the histogram entries are enough to get a statistical uncertainty of around 1% in the last bin, but both histograms are systematically higher than the TGraph and TF1.
This problem happens with any combinations of the PDF parameters.

Any ideas?

ROOT Version: 6.20
Platform: Ubuntu
Compiler: gcc

bellenot · March 30, 2020, 9:30am

Maybe @couet or @moneta could comment on this

Wile_E_Coyote · March 30, 2020, 9:33am

double k = i + K + 0.5; // half of the histogram bin width

moneta · March 30, 2020, 9:56am

HI,

You should set the number of internal bins used by TF1 corresponding to the discrete values.
For example in your case you have a range in [0,18], you should then do
f->SetNpx(18);

IN addition you need also to set the histogram bins correctly. These are used inside FillRandom(), which uses the integral of the function within a bin. For this reason the histogram axes should be:
18 bins min = 0, max = 18.

Lorenzo

ccorti · March 30, 2020, 10:11am

I set a large number of points for the TF1 so its plot would have been a step function.
Anyway, setting min = 0, max = 18 in the histogram fixed everything, no matter how many points the TF1 has.
Thanks!

moneta · March 30, 2020, 10:16am

I think the number of internal TF1 points is used in TF1::GetRandom, while in TH1::FillRandom the histogram bins are used. Only if you don;t specify the bins in the histogram then in that case the one from TF1 are used .
Also if you want to plot a discrete TF1, you should do,

f->SetNpx(18); 
f->GetHistogram()->Draw();

Lorenzo

ccorti · March 30, 2020, 10:19am

Thanks Lorenzo, this is much better!

moneta · March 30, 2020, 10:29am

One more thing, when using TF1::GetRandom you will still get correct results if your TF1 bins are multiple of 18. So with npx=3600, the result will be still correct. however it will be slower in term of CPU usage

Lorenzo

system · April 13, 2020, 10:29am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.