Fit a waveform with RooFit

Dear all,
I want to fit a waveform that is the overlap of two signals, as the figure shows (blue line). I chose the sum of two Gaussian functions as the model.


I tried to use RooFit to handle the data with

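    import ROOT

    # St (time samples) and Sa (amplitudes) are the arrays read from the oscilloscope
    x = ROOT.RooRealVar("x", "Time", St[0], St[-1])
    y = ROOT.RooRealVar("y", "Signal", 0, 10)

    # One unweighted entry per sample, with two observables (x, y)
    data_set = ROOT.RooDataSet("data", "dataset with x and y", ROOT.RooArgSet(x, y))
    for i in range(len(St)):
        x.setVal(St[i])
        y.setVal(Sa[i])
        data_set.add(ROOT.RooArgSet(x, y))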

and

    # Parameters of the two Gaussians: initial value, lower bound, upper bound
    mean1 = ROOT.RooRealVar("mean1", "mean of Gaussian 1", 0.0002, 0, 0.003)
    sigma1 = ROOT.RooRealVar("sigma1", "width of Gaussian 1", 0.0005, 0.0001, 0.1)
    A1 = ROOT.RooRealVar("A1", "amplitude of Gaussian 1", 0.1, 0, 10)

    mean2 = ROOT.RooRealVar("mean2", "mean of Gaussian 2", 0.0003, 0, 0.004)
    sigma2 = ROOT.RooRealVar("sigma2", "width of Gaussian 2", 0.0005, 0.0001, 0.1)
    A2 = ROOT.RooRealVar("A2", "amplitude of Gaussian 2", 0.1, 0, 10)

    gauss1 = ROOT.RooGaussian("gauss1", "Gaussian 1", x, mean1, sigma1)
    gauss2 = ROOT.RooGaussian("gauss2", "Gaussian 2", x, mean2, sigma2)

    # Sum of the two Gaussians with coefficients A1 and A2
    signal = ROOT.RooAddPdf("signal", "sum of gaussians",
                            ROOT.RooArgList(gauss1, gauss2), ROOT.RooArgList(A1, A2))

    # Fit the model to the dataset, then plot the data on a frame in x
    signal.fitTo(data_set)
    c = ROOT.TCanvas("rf101_basics", "rf101_basics", 800, 400)
    frame = x.frame()
    data_set.plotOn(frame)

The result is as shown. Why can’t I draw the original waveform? And is there a suitable function to fit it?


ROOT Version: 6.32.02
Platform: macosxarm64
Compiler: clang-1500.3.9.4


Hello @dm-leo,

can you plot the dataset on its own, without the fit function? This would be a first step to see what’s going on.
It may help to convert the dataset either to a weighted dataset, with x as the coordinate and y as the weight of each data point, or to a RooDataHist, with x as the binned coordinate and y as the bin content.
The reason is that you have a 2-D dataset at the moment, with weights of one, so you would have to fit a 2-D function as well. What you want instead is a function in x, though, so only one variable.

I will call in @jonas to have a look for more ideas.

Hi Stephan,
The second plot is just the data. I can’t get the dataset to plot like the first figure. In my data, x is the time and y is the amplitude; it’s a waveform from an oscilloscope. So I find it odd to use x or y as a weight. What would that mean physically? Since this is my first time using RooFit, where could I find more detailed information on fitting waveforms?
Thanks a lot for your reply :grinning:

I think I understand the problem a bit better now. You created a 2-D dataset, and plotOn is asked to plot it as 1-D in x. What RooFit does in that case is project the dataset onto its x component and plot that.
What is the x component? It’s a bunch of values that go from about 0 to about 0.0045. This is precisely what your plot shows.
If you asked it to plot on a frame of y, you would see the y component.

The misunderstanding is about what RooFit was designed for: counting experiments.
In HEP, you typically do counting experiments, so you count how many times you observed a certain value. Let’s say I saw a mass of 5 ten times and a mass of 1 once. The mass in this case is called the “observable”, and we have only one observable.

Let me give an example: Take this histogram:

    _ _ x _

Its dataset would look like [ 2 ]. That is, you have one observation where x=2.
Conversely, a dataset with three observations would be:

        x
    x _ x _

[0, 2, 2]. We saw the value 0 once and the value 2 twice. That’s one observable, but three entries.

You see that in RooFit these datasets are treated as 1-dimensional because they have one observable. In your case that’s the “Time” dimension. The vertical dimension of the plot is just the number of observations.
If you create a 2-D dataset, you have to plot it as 3-D. You have observations in (x,y), but the z coordinate is the number of counts you observed. And so on.
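To make the counting picture concrete, here is a minimal plain-Python sketch (not RooFit code) that turns the [0, 2, 2] dataset from above back into its histogram. The bin range 0..3 and the text rendering are assumptions chosen to match the little drawing.

```python
from collections import Counter

# The three-entry dataset from the example: one observable, three observations
dataset = [0, 2, 2]

# Plotting a dataset means counting how often each value was observed
counts = Counter(dataset)

# Render the histogram as text over the bins 0..3, one "x" per observation
bins = range(4)
rows = []
for level in range(max(counts.values()), 0, -1):
    cells = []
    for b in bins:
        if counts[b] >= level:
            cells.append("x")      # at least `level` observations in this bin
        elif level == 1:
            cells.append("_")      # empty bin, drawn on the baseline
        else:
            cells.append(" ")      # nothing stacked at this height
    rows.append(" ".join(cells))

print("\n".join(rows))
```

The height of each column is the number of entries with that value, which is all the vertical axis of a RooFit data plot represents.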

So let’s come back to your case:
The oscilloscope plotted (Time,Signal), but the function you want to fit is a function only of the time variable, so it’s 1-D as far as RooFit is concerned. So how can we express the fact that we have “many counts” in the peak, and “few counts” on the left? With weights. So we convert your y-variable to weights, and the Time serves as the observable.

This is the tutorial where weights are introduced:
https://root.cern/doc/master/rf403__weightedevts_8py.html
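The effect of the weights can be sketched in plain Python, without any RooFit calls. The pulse below is a hypothetical stand-in for the oscilloscope data (a single Gaussian sampled at equally spaced times), and `histogram` is a helper written just for this illustration.

```python
import math

# Hypothetical stand-in for the oscilloscope arrays: one Gaussian pulse
# sampled at equally spaced times (the real data is not shown in the thread)
n_samples = 200
times = [i * 0.0045 / (n_samples - 1) for i in range(n_samples)]
signal = [math.exp(-0.5 * ((t - 0.002) / 0.0004) ** 2) for t in times]


def histogram(values, weights, n_bins, lo, hi):
    """Sum the weights of all values that fall into each of n_bins equal bins."""
    content = [0.0] * n_bins
    width = (hi - lo) / n_bins
    for v, w in zip(values, weights):
        index = min(int((v - lo) / width), n_bins - 1)  # v == hi goes in the last bin
        content[index] += w
    return content


n_bins = 20
# Unit weights: every sample counts once, so the projection onto time is flat
unweighted = histogram(times, [1.0] * n_samples, n_bins, times[0], times[-1])
# Signal as weight: the bin contents now follow the waveform
weighted = histogram(times, signal, n_bins, times[0], times[-1])

for u, w in zip(unweighted, weighted):
    print(f"{u:5.0f}  {w:8.3f}  " + "#" * round(w))
```

With unit weights every time bin holds the same number of samples, which is the flat distribution seen when the 2-D dataset is plotted in x. Using the signal as the weight makes the bin contents trace the pulse.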

Hi @dm-leo and @StephanH,

I’m not sure if you can even interpret the y value as “counts” like in a counting experiment in this case.

RooFit is mainly meant for modelling probability densities and fitting them to data, but here you don’t want to fit a pdf but a function y = f(x), right?

We also support this with a chi2-fit, see this tutorial:
https://root.cern/doc/master/rf609__xychi2fit_8py.html

But for RooFit, this is a corner case that is not supported by all of RooFit’s features. Why did you choose RooFit to fit the waveform? Why not the regular TGraph::Fit() (ROOT: TGraph Class Reference)?
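What a chi-square fit of a function y = f(x) does can be sketched in plain Python. This is not the RooFit or TGraph API. The waveform is synthetic and noise-free, and the means and widths are hypothetical values held fixed so that the two amplitudes follow from linear least squares; a real fit would float the shape parameters as well.

```python
import math


def gauss(x, mean, sigma):
    """Unnormalised Gaussian shape with peak height 1."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


# Hypothetical pulse shapes, fixed here to keep the sketch short
MEAN1, SIGMA1 = 0.0015, 0.0003
MEAN2, SIGMA2 = 0.0022, 0.0005

# Synthetic noise-free waveform made of two overlapping pulses
xs = [i * 0.0045 / 199 for i in range(200)]
true_a1, true_a2 = 3.0, 1.5
ys = [true_a1 * gauss(x, MEAN1, SIGMA1) + true_a2 * gauss(x, MEAN2, SIGMA2) for x in xs]


def chi2(a1, a2, err=1.0):
    """Sum of squared residuals between the samples and the model, in units of err."""
    return sum(
        ((y - a1 * gauss(x, MEAN1, SIGMA1) - a2 * gauss(x, MEAN2, SIGMA2)) / err) ** 2
        for x, y in zip(xs, ys)
    )


# With fixed shapes, chi2 is quadratic in (a1, a2), so its minimum
# is the solution of the 2x2 normal equations.
g1 = [gauss(x, MEAN1, SIGMA1) for x in xs]
g2 = [gauss(x, MEAN2, SIGMA2) for x in xs]
s11 = sum(a * a for a in g1)
s12 = sum(a * b for a, b in zip(g1, g2))
s22 = sum(b * b for b in g2)
t1 = sum(a * y for a, y in zip(g1, ys))
t2 = sum(b * y for b, y in zip(g2, ys))
det = s11 * s22 - s12 * s12
fit_a1 = (t1 * s22 - t2 * s12) / det
fit_a2 = (t2 * s11 - t1 * s12) / det

print(f"A1 = {fit_a1:.3f}, A2 = {fit_a2:.3f}, chi2 = {chi2(fit_a1, fit_a2):.3g}")
```

Here y is treated as a measured value at each x rather than as a count, and the two fitted terms, fit_a1 times the first Gaussian and fit_a2 times the second, are the separated pulses.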

Thanks very much!
So RooFit is not like TGraph but more like TH1: the counts must be known. RooFit has to work with histogram-like data, right? So we must find another way to handle this data. I will study the link you provided and check whether the waveform can be fitted with RooFit. :wink:

Hi Jonas,
I did try TGraph::Fit(). The result is similar to the first figure. What I want is to separate the two waveforms in the data.
Because the data comes from an oscilloscope, y is not a count as in a histogram. The overlapping signals are difficult to separate with an ordinary fit, so my instructor suggested RooFit.
I will have a look at your link. Thanks a lot. :wink: