Troubles w/ a fit using a user-defined function +a histogram

furutaka · April 15, 2009, 1:08pm

Hi,

I’m writing a class to fit gamma-ray peaks observed in Ge detectors and having a few troubles (perhaps due to my lack of understanding). Could some kind soul explain why the following troubles occur and how I can fix them???

The attached TTest.C is a simplified version of the class, which can be tested using a sample script (multiHistFit.C) and a root file (test.root).

When I fit a peak with a built-in function (a Gaussian, which is commented out in the attached file), I don’t have any trouble (a successful fit and its display), but by using a user-defined function+histogram(actually a background determined by using TSpectrum), I got

Warning in <TCanvas::ResizePad>: Inf/NaN propagated to the pad. Check drawn objects. Warning in <TCanvas::ResizePad>: c1 height changed from 0 to 10 and the result of the fit is not displayed although the fit seems to be successful. What are the differences between them and how can I fix the code?

Updating of the canvas just before the fit cause a segfault. Why?

Thanks in advance.

Yours,
Kazuyoshi
test.root (40.6 KB)
multiHistFit.C (169 Bytes)
TTest.C (2.19 KB)

furutaka · April 15, 2009, 1:19pm

Ah, I forgot to mention my environment(s):
root v5.23/02 on Fedora 10 (i386 & x86_64).

Kazuyoshi

couet · April 15, 2009, 2:16pm

This message means that somewhere in your code your define a wrong canvas size. I will look to see if I find the mistake.

couet · April 16, 2009, 2:48pm

In your macro you loop 30 times on “par” but “par” has never been been dimensioned so you go outside the memory limits of the par vector.

furutaka · April 17, 2009, 12:39am

Dear Olivier,

Thanks for pointing out the cause of the problem. Indeed, the problem was caused by the fact that the fitting function ‘fFitFunc’ has only three parameters whereas the peak-fit function ‘fpeaks()’ loops over 30 peaks (90 parameters); and the following modification solved the immediate ploblem:

--- TTest.C.orig	2009-04-15 21:55:06.000000000 +0900
+++ TTest.C	2009-04-17 08:42:57.000000000 +0900
@@ -1,4 +1,4 @@
-const Int_t     kNumMaxPeaks   = 30;
+const Int_t     kNumMaxPeaks   = 1;
 const Int_t     kNumMaxPars    = 3*kNumMaxPeaks;
 const Int_t     kNumMaxHalfWid = 30;         // for FWHM estimate

Now I got a few other questions…

I think that it would be better if we can pass the number of peaks/parameters to the fitting function to loop as many times as exactly needed. Are there any way to pass the number of parameters to user-defined fitting functions? Is it the only way to pass the number of parameters as one of the parameters themselves??? (because the signature of user-defined function has to be Double_t fitf(Double_t *x, Double_t *par) and we can’t pass the number as an argument…)
My fitting function is a slightly modified version of the one in $ROOTSYS/tutorials/spectrum/peaks.C: it works but mine doesn’t, and I tried to find why. I thought that the clue was the use of TVirtualFitter::Fitter(obj,maxpar), which has no sufficient description in http://root.cern.ch/root/html/TVirtualFitter.html#TVirtualFitter:Fitter, but using the method with maxpar=100 (w/ kNumMaxPeaks=30 & the number of parameters declared in TF1() = 3) didn’t solve the problem. Could someone explain usage&purpose of the method?

Yours,
Kazuyoshi

furutaka · April 17, 2009, 6:54am

Hi again…

Another problem is that, when using a more elaborated version of the class from a (Python) script over the same TH1D histograms again and again, it sometimes fails to display the results of the fits: sometimes it fails silently, sometimes it tries to display the results in extremely larger ordinate scales (e.g. 10**240), sometimes with the same errors (NaN) as the first post…

The fits seem to be successful, because almost the same parameter values obtained when displayed successfully are obtained even in the failing cases.

The attached files contain console logs of the repeated python-script runs and the resultant postscript files. In a run, a single peak is searched for in a part of each of 25 TH1D histograms (hH00-hH31), and each peak found in the region was fitted with (a Gaussian)+(BG histogram estimated w/ TSpectrum) after estimating the Gaussian parameters using BG-subtracted histogram. Each histogram was captured before and after the fit to the attached postscript files.

Any suggestions to avoid failing of the display???

Yours,
Kazuyoshi
logs.tar.gz (6.6 KB)
figs.tar.gz (268 KB)

brun · April 17, 2009, 7:14am

Could you send the shortest possible RUNNING C++ script (not python) reproducing your problem?

Rene

furutaka · April 17, 2009, 8:50am

[quote=“brun”]Could you send the shortest possible RUNNING C++ script (not python) reproducing your problem?
[/quote]

Sorry, I don’t have time tonight; I’ll do tomorrow.

Kazuyoshi

furutaka · April 18, 2009, 5:17am

[quote=“brun”]Could you send the shortest possible RUNNING C++ script (not python) reproducing your problem?
[/quote]

The attached are (a simplified version of) class files (TGTsBGFit.{cxx,h}), a root file which contains histograms to be fitted (coin3953_sumprj.root), C++ (fit.C) as well as python(fit.py) scripts which do almost the same thing, and Bourne shell scripts (fit.C.sh and fit.py.sh) which runs the same C++ or python scripts 50 times (and save the fit results and console logs to separate files).

Please run fit.C.sh.

The class TGTsBGFit was written to

locate a few (gamma-ray) peaks in a region of a histogram,
estimate their positions, widths, and heights from a spectrum in which B.G. is subtracted using TSpectrum, and
fit the peaks starting from the estimated parameters.

coin3953_sumprj.root contains histograms observed in Ge detectors, and the fit.C and fit.py scripts fit 511-keV gamma-ray peaks in the histograms.

With fit.C, the problems occur extremely fewer (about once in every 50 fit.C runs), compared to the runs using fit.py (~20 times in 50 runs)…

Kazuyoshi
coin3953_sumprj.root (950 KB)
scripts.tar.gz (1.86 KB)
class-files.tar.gz (2.36 KB)

furutaka · April 18, 2009, 8:49am

Ah, forgot to attach one file, for the python version…

Files w/ .dat extension can’t be attached and the contents are listed in the following:

coin3953_sumprj.root hH00 0.742309 -0.822299 coin3953_sumprj.root hH01 0.746390 -0.650575 coin3953_sumprj.root hH02 0.750189 -1.004017 coin3953_sumprj.root hH03 0.748014 -1.015329 coin3953_sumprj.root hH04 0.750298 -1.078479 coin3953_sumprj.root hH05 0.741829 -0.491117 coin3953_sumprj.root hH06 0.758071 -0.318672 coin3953_sumprj.root hH07 0.745851 -2.518277 coin3953_sumprj.root hH08 0.750407 0.347851 coin3953_sumprj.root hH09 0.752432 -0.277543 coin3953_sumprj.root hH10 0.749916 -1.567872 coin3953_sumprj.root hH11 0.744130 -0.589049 coin3953_sumprj.root hH12 0.744559 -0.139822 coin3953_sumprj.root hH16 0.741616 -1.085679 coin3953_sumprj.root hH17 0.735589 0.869370 coin3953_sumprj.root hH18 0.749916 -0.817956 coin3953_sumprj.root hH19 0.740764 -0.497455 coin3953_sumprj.root hH24 0.749481 -1.270019 coin3953_sumprj.root hH25 0.752981 -1.403638 coin3953_sumprj.root hH26 0.751446 -1.110200 coin3953_sumprj.root hH27 0.754302 -0.794053 coin3953_sumprj.root hH28 0.753421 -1.702961 coin3953_sumprj.root hH29 0.744452 -1.554968 coin3953_sumprj.root hH30 0.753366 -0.912160 coin3953_sumprj.root hH31 0.749590 -2.094039

These are (linear) energy calibration coefficients for the histograms.

Kazuyoshi

furutaka · May 14, 2009, 2:16pm

I still can not resolve the problem…

A few notes:

Frequency of the problem depends strongly on architecture: it occurs often on i386 machines but is very rare on x86_64s.
The problem occurs when TF1::EvalPar() returns a strange (NaN or extremely large) value.

Kazuyoshi

brun · May 14, 2009, 2:50pm

Could you reduce the problem to the simplest possible C++ script (no Python please)?
I would suggest to run your system under valgrind. I am suspecting either an undefined variable or address out of range in the code.

Rene

furutaka · May 14, 2009, 3:13pm

Hi,

I do my test with the (simplified) class AND with a C++ script. Is the script what you need?

I did run it under valgrind, but couldn’t find anything. I’ll do it again.
I suspect that something (in my code or…) sometimes (no idea why) breaks what fFunctor points to and EvalPar() returns a meaningless value.

Kazuyoshi

couet · May 14, 2009, 3:24pm

In your code you do some loops over indeces without protection. For instance this one:

  for (Int_t np=0; np<fNumPeaksFound; np++) {
         [...]
         fPar[3*fNumPeaksAboveBG+2] = uppW-lowW;
         fNumPeaksAboveBG++;
         [...]
   }

what about if 3*fNumPeaksAboveBG+2 exceed the dimension of fPar ? I have just looked at your code… not ran it … so may be that bit is ok … But at least it looks suspicious to me.

brun · May 14, 2009, 3:26pm

Provide something like

root > .L myclass.C+ root > .x myscript.C
and of course the data file

Rene

furutaka · May 14, 2009, 8:34pm

Good morning!

[quote=“couet”]In your code you do some loops over indeces without protection. For instance this one:

  for (Int_t np=0; np<fNumPeaksFound; np++) {
         [...]
         fPar[3*fNumPeaksAboveBG+2] = uppW-lowW;
         fNumPeaksAboveBG++;
         [...]
   }

what about if 3*fNumPeaksAboveBG+2 exceed the dimension of fPar ? I have just looked at your code… not ran it … so may be that bit is ok … But at least it looks suspicious to me.[/quote]

Well, fNumPeaks{Found,AboveBG} are initialized to 0 in the constructor, and the number of peaks found is substituted into the former before the loop. The dimension of the array fPar at this moment is 5*3 (3 Gaussian parameters for 5 peaks), while I’m testing the code with one peak with TSpectrum::Search(). Therefore as far as the code above is concerned I think it’s OK.
The attached code is a simplified version of the original, i.e. many statements to check/confine range of variables are removed from the original to simpify.

Kazuyoshi

furutaka · May 14, 2009, 8:47pm

Good morning…

[quote=“brun”]Provide something like

root > .L myclass.C+ root > .x myscript.C
and of course the data file

Rene[/quote]

Please run the attached script with the contents of class-files.tar.gz and coin3953_sumprj.root which were attached to the previous posts, as the following:

root > .L TGTsBGFit.cxx+ root > .x manyfits.C
Thanks in advance.

Kazuyoshi
manyfits.C (1.41 KB)

furutaka · May 18, 2009, 9:21pm

Hi,

The problem seems to occur when some of the function parameters have strange values (e.g. nan): the parameters are initialized in InitPars() but some of the unused parameters (to describe the peaks found) are then altered to strange values somewhere and used in fpeaks().

but where and how?, I don’t know…

A secure fix would be to use a functor which only uses the necessary number of parameters (three times the number of peaks found) to calculate the corresponding function value.

Kazuyoshi

furutaka · May 19, 2009, 7:05am

Hi,

Are there any fitting examples in which functors (and TF1s) are used?

I made one but it doesn’t fit; with the “V” option of Fit() I see that initial values are OK, but the final values are not…

Yours,
Kazuyoshi

brun · May 19, 2009, 10:42am

Let me repeat my request;
Could you provide a simple script that can reproduce the problem in a short amount of time?
Note that your class is full of memory leaks. You do not delete most objects created in the constructor or after.

Rene