Transforming bivariate Gaussians instead of generating

Gramotey · June 27, 2023, 9:30pm

Hello everyone -

This is pretty much a math question. ROOT part is only keeping in mind we work with histograms TH2… Ok.
I am generating images for pencil beam scanning proton therapy, and I represent individual beam samples intensity profiles (cross section beam profiles) as covariant bivariate Gaussians, for starters.
We have many millions of such samples, having beam position X, Y, total intensity (size), SigmaX, SigmaY, and covariance angle Theta. These are experimentally measured values, they are in the can (a ROOT tree) and readily available.

Problem / Question:
It takes a long time to do FillRandom(“tf_bigaus”, nstat) after passing new parameters to the bigaus function…
Is there a simpler way somehow to have a distribution generated once, nice and good stats, as a histogram, and then to scale and stretch it in X and Y and rotate - to get a histogram for our samples without actually throwing random numbers again and again…?

Thank you very much in advance - all advice is welcome!
I am somehow just got short-circuited on this…
Warmest regards,
Evgeny.

Please read tips for efficient and successful posting and posting code

Please fill also the fields below. Note that root -b -q will tell you this info, and starting from 6.28/06 upwards, you can call .forum bug from the ROOT prompt to pre-populate a topic.

ROOT Version: 6.24
Platform: Debian 12
Compiler: g++ 12.2.0

mczurylo · June 28, 2023, 10:03am

Hello @Gramotey,

welcome back to the forum and thank you for your question. I think @moneta could help to solve your doubt.

Cheers,
Marta

moneta · June 28, 2023, 12:05pm

Hi,

I am not sure I have understood your goal. You have in a TTree N values of bi-variate gaussian parameters (a vector X and a covariance matrix C) and what you would like exactly to do ?
With FillRandom you sample for each N the corresponding distribution. Do you need to create N histogram for these distributions ?
If this is the need, it is maybe also faster to generate 2 correlated normal numbers, using for example the function Random::Gaussian2D.

ROOT::Math::GSLRngMT r;
double x,y;
r.Gaussian2D(sigmax, sigmay, rho,x,y);

Lorenzo

ferhue · June 28, 2023, 4:13pm

Have you considered plotting a bigaus TF2 directly?

If you need a discretized TF2 as a TH2, why not calculating the tf2 in each bin, instead of using FillRandom?

Gramotey · June 28, 2023, 7:23pm

Dear Lorenzo,
To clarify the problem, I need to create a combined total histogram that will have signals from all those little Gaussian “samples”, which is a sum of a couple of millions of individual histograms. What I am doing now, I do have a large “final” TH2 of a large size (1000x1000 ballpark), and then I generate Gaussians in another, small TH2 which I call a “patch” that are some 50x50 or so. Signals are small narrow beam profiles with sigma anywhere between 3 and 10 so 50x50 is good enough to get most of the signal in it. I generate Gaussian at the center of the “patch”, then knowing X and Y I place / copy that histogram contents into the large “final” one (so I am adding up those signals).

Even though I use a small patch to generate Gaussians, it is still too slow.
I realize that Gaussian2D is also throwing random numbers so… would it be the same as I do now,

tf_bigaus->SetParameters(SigI, 0., SigmaX, 0., SigmaY, cov);
hf_sptpth->FillRandom("tf_bigaus", nstat); // fill the small patch histogram, nstat is statistic, say 50,000 still too slow

What I was thinking, to generate some nice Gaussian distribution, 2D, symmetric, and then to transform it like we do in a graphics editor :)) stretch, and turn, and scale, to match the needed parameters in the end.

Just don’t know if that is at all possible, and the math approach involved, like 1-2-3… what do I need to do.

Any other suggestion how to speed things up - any alternative is most welcome at this point, I kind of squared my mind in a box and need help / ideas to get out of it haha
… by the way, there is a video. There supposed to be like, 60 layers. I am only drawing when a spot completes so each new update is actually about 100 individual Gaussians… too slow.

Gramotey · June 28, 2023, 7:28pm

@ferhue – sounds intriguing. Never done that…
So let me get this straight… having say a TH2 of say 50x50 bins, and a bigaus function, I use X, Y centers of every bin, as inputs… I am confused how to do that.
But I get that then I would only have 2500 times to calculate, and that sounds way better than throwing 100,000 random numbers…
Could you please help / some example or a pseudo code of what exactly do you mean?

My function is

//========================================================================
// parameters are “const”, “meanx”, “sigmax”, “meany”, “sigmay”, “correlation”:
Double_t f_bigaus(Double_t *x, Double_t *par){
    
    Double_t arg;
    arg =  par[0]*TMath::Exp(-0.5/(1.0-par[5]*par[5])
                            *(
                                 TMath::Power((x[0]-par[1])/par[2],2)
                                +TMath::Power((x[1]-par[3])/par[4],2)
                                -2.0*par[5]*(x[0]-par[1])
                                /par[2]*(x[1]-par[3])
                                /par[4]
                             )
                            );
    return arg;
};

moneta · June 28, 2023, 7:47pm

Hi,
You can do either throw random number, which if you use the Gaussian2D function is much much faster than FillRandom, if you are changing all the time the parameters, or evaluating the bivariate gaussian at the centre of the bins in a given grid. I am not sure which one is faster, but probably about the same.
If you need to save time you can run this early in parallel in multi-threads,
Lorenzo

Gramotey · June 28, 2023, 7:51pm

– I do not actually understand HOW to evaluate the bivariate gaussian at the centre of each of the bins… Yes horrible I know… =-}} I have a function defined in another post above. Could you help me by making a simple “skeleton” example?
I don’t understand as bins are wide / have width, how do I normalize? I am a bit lost, but I feel that this would be what I want to do, really.
… I will try the Gaussian2D that is easy enough tho. Thank you for that!

moneta · June 28, 2023, 8:12pm

It is easier throwing random number and it should be faster if you use N less than nbin^2

Lorenzo

ferhue · June 29, 2023, 5:38pm

This should do the job. You just need to take care of normalizing correctly the area / amplitude (integral of counts per beamlet).

TF2* f2 = new TF2("f2","bigaus");
Double_t meanX = -3;
Double_t sigmaX = 0.2;
Double_t meanY = 5;
Double_t sigmaY = 0.5;
Double_t correlation = 0.01;
Double_t amplitude = 1; // Change this with Area/sqrt(2pi sigmax sigmay) to normalize to Area. You can also divide by binWidthX*binWidthY if you want a density
f2->SetParameters(amplitude,meanX,sigmaX,meanY,sigmaY,correlation);// “const”, “meanx”, “sigmax”, “meany”, “sigmay”, “correlation”
Double_t maxSigmaFactor = 3; // stop after +/-3*sigma
TH2F* h2 = new TH2F("h2", "h2", 500, -10, 10, 500, -5, 8);
Int_t startX = h2->GetXaxis()->FindBin(meanX - maxSigmaFactor * sigmaX);
Int_t endX = h2->GetXaxis()->FindBin(meanX + maxSigmaFactor * sigmaX);
Int_t startY = h2->GetYaxis()->FindBin(meanY - maxSigmaFactor * sigmaY);
Int_t endY = h2->GetYaxis()->FindBin(meanY + maxSigmaFactor * sigmaY);

for(Int_t i = std::max(1,startX); i<=std::min(h2->GetNbinsX(),endX); ++i)
    for(Int_t j = std::max(1,startY); j<=std::min(h2->GetNbinsY(),endY); ++j)
        h2->SetBinContent(i,j,f2->Eval(h2->GetXaxis()->GetBinCenter(i), h2->GetYaxis()->GetBinCenter(j)));
auto c = new TCanvas();
h2->Draw("COLZ");

Gramotey · June 29, 2023, 9:39pm

Oh my, I see what is going on! Thank you @ferhue
Wow!
… now I am quite compelled to provide you with a generous amount of your favorite drink (beer etc.) we need to have the "buy a " functionality here haha

I really do appreciate all your help! Thank you!

system · July 13, 2023, 9:39pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.