IParametricGradFunctionMultiDim fitting, ParameterGradient()

I implemented my own IParametricGradFunctionMultiDim with analytical gradients for fitting purposes. However, I see two strange things:

  1. During a fit, ParameterGradient() is always called, which calculates the whole gradient vector. Should it be so? I am not sure that calculating the derivatives for the fixed parameters is necessary here.

If I implement ParameterGradient in an improper way, DoParameterDerivative is called instead during fitting. It allows one to specify for which parameter the derivative should be calculated. However, it is again called for all parameters, no matter which ones are fixed.

  2. When fitting with Minuit, at the beginning of the fit the function and the derivatives are automatically calculated with all parameters = 0, even if for some parameters there are limits excluding 0.

Additionally, I have a problem with the final step size. During fitting, ROOT starts changing my parameter with a reasonable step. However, at some point the parameter changes are so small that the function and the function gradient do not change. This causes the warning: “FUNCTION VALUE DOES NOT SEEM TO DEPEND ON ANY OF THE 1 VARIABLE PARAMETERS. VERIFY THAT STEP SIZES ARE BIG ENOUGH AND CHECK FCN LOGIC”.

Well, that’s definitely true; my function does not change for such small parameter changes. However, the question is: does it affect fit convergence in any way?

Hi,

Yes, the derivatives for the fixed parameters are not necessary and could probably be skipped. I will see if I can easily implement this so that they are skipped.
However, as you mention, ParameterGradient is implemented by default using DoParameterDerivative. You could simply provide a dummy implementation (for example, one returning zero) for the fixed parameters.
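For illustration, the default implementation behaves roughly like this (a sketch of the idea, not the exact ROOT source):

void ParameterGradient(const double *x, const double *p, double *grad) const
{
   // default: build the full vector by asking for one derivative at a time
   for (unsigned int ipar = 0; ipar < NPar(); ++ipar)
      grad[ipar] = DoParameterDerivative(x, p, ipar);
}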

At the beginning of the fit, the function (and the derivatives) are calculated for the initial values of the parameters (the ones the user provides).
They might be zero if you don’t provide any values.
Please send me a log file produced by Minuit if you still observe this problem.
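For completeness, providing the initial values (and limits) through the ROOT::Fit::Fitter configuration looks roughly like this (the indices and numbers are only illustrative):

ROOT::Fit::Fitter fitter;
double initialPars[2] = {1.0, 0.5};                  // illustrative starting values
fitter.Config().SetParamsSettings(2, initialPars);   // number of parameters and their values
fitter.Config().ParSettings(1).SetLimits(0.1, 10.);  // limits excluding zero for parameter 1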

The parameter change in Minuit is calculated using the gradient and the Hessian. You get a small change when the gradient goes towards zero, which is normally when you are close to the minimum.
If you have a region with zero gradient which is not the minimum, it is true that this could confuse Minuit. You should then set the parameter values outside this region.
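Schematically, the step is the Newton step built from the gradient g and the (approximate) Hessian H of the FCN, so it shrinks as the gradient goes to zero (a simplified picture; Migrad actually updates an estimate of the inverse Hessian iteratively):

\Delta p = - H^{-1} g, \qquad g_i = \partial F / \partial p_i, \quad H_{ij} = \partial^2 F / (\partial p_i \, \partial p_j)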

Best Regards

Lorenzo

[quote]
Yes, the derivatives for the fixed parameters are not necessary and could probably be skipped. I will see if I can easily implement this so that they are skipped.
However, as you mention, ParameterGradient is implemented by default using DoParameterDerivative. You could simply provide a dummy implementation (for example, one returning zero) for the fixed parameters.
[/quote]

Well, I know that this is not skipped because I need to debug the gradient calculations deeply. But anyone who is not aware of the unneeded gradient calculations will wonder why their fit takes so long :slight_smile: So skipping unneeded gradient requests would be nice.

Another thing is that DoParameterDerivative is not called by default. By default, ParameterGradient() is called, and only if it is not available is DoParameterDerivative called. You can see this in the attached sample - “ParameterGradient” is printed for every request, while “DoParameterDerivative” is not printed at all. So with this implementation one does not know which derivative was requested.
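For reference, the tracing is along these lines (a sketch, not the exact attached code; MyAnalyticalDerivative is a hypothetical placeholder for my real derivative code, and both overrides live in the class deriving from ROOT::Math::IParametricGradFunctionMultiDim):

void ParameterGradient(const double *x, const double *p, double *grad) const
{
   std::cout << "ParameterGradient" << std::endl;     // printed on every request
   for (unsigned int i = 0; i < NPar(); ++i)
      grad[i] = MyAnalyticalDerivative(x, p, i);      // hypothetical helper
}

double DoParameterDerivative(const double *x, const double *p, unsigned int ipar) const
{
   std::cout << "DoParameterDerivative" << std::endl; // never printed in the sample
   return MyAnalyticalDerivative(x, p, ipar);         // hypothetical helper
}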

Yes, I know that the initial values may be 0 if I do not provide any :slight_smile: This is not the case. See the attached sample - the parameters always start at 0 (I think it was different with Fumili). It is so even if the parameter limits exclude 0. And even in this sample you can see the Minuit warning about Chi2 point rejection caused by these zero-valued parameters.

By the way, what is “a log file produced by Minuit”? How do I obtain it?

[quote]
The parameter change in Minuit is calculated using the gradient and the Hessian. You get a small change when the gradient goes towards zero, which is normally when you are close to the minimum.
If you have a region with zero gradient which is not the minimum, it is true that this could confuse Minuit. You should then set the parameter values outside this region.
[/quote]

No, the gradient is 0 only at the minimum. However, I am not sure I understand. Let’s assume the fitter is close to the minimum. It changes the parameter in the 4th decimal place, and the function value changes, getting closer to the minimum. Then it changes the parameter in the 5th decimal place, and the function value changes a little. But then it changes the parameter in the 6th decimal place, and such a small parameter change does not change the function. Here it displays the message that the parameter change does not seem to affect the function. That is true, but it is also what is expected… Do you mean that such a situation happens when the gradient changes with the parameter change but the function value does not?

Another question: when does the fitting procedure compare the analytical gradients with numerical ones? In the attached sample, sometimes the comparison is displayed and sometimes it is not.
exampleGradFit.C (3.63 KB)

Hi,

I agree it would be nice to automatically skip the derivative calculation for the constant parameters and, as I said before, I will try to add this feature when using the ROOT::Fit::Fitter class. The problem is that both Minuit and Minuit2 require in their API that the full gradient vector be provided.

ParameterGradient() is used when it is provided by the function. If you don’t provide an implementation of ParameterGradient(), then DoParameterDerivative is used. You should then implement DoParameterDerivative only for the non-fixed parameters and return 0 for the others.
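A minimal sketch of this, assuming an illustrative 1-D model f(x; p) = p0 * exp(-(x - p1)^2) where parameter 1 is the fixed one (the model and values are not from your code):

#include "Math/IParamFunction.h"
#include <cmath>

class MyGradFunc : public ROOT::Math::IParametricGradFunctionMultiDim {
public:
   MyGradFunc() { fPars[0] = 1.; fPars[1] = 0.; }
   unsigned int NDim() const { return 1; }
   unsigned int NPar() const { return 2; }
   const double *Parameters() const { return fPars; }
   void SetParameters(const double *p) { fPars[0] = p[0]; fPars[1] = p[1]; }
   ROOT::Math::IParametricGradFunctionMultiDim *Clone() const { return new MyGradFunc(*this); }

private:
   double DoEvalPar(const double *x, const double *p) const
   {
      const double d = x[0] - p[1];
      return p[0] * std::exp(-d * d);
   }
   // real derivative only for the free parameter p0;
   // the fixed parameter p1 gets the dummy zero suggested above
   double DoParameterDerivative(const double *x, const double *p, unsigned int ipar) const
   {
      if (ipar == 1) return 0.;  // fixed parameter: skip the real work
      const double d = x[0] - p[1];
      return std::exp(-d * d);   // df/dp0
   }

   double fPars[2];
};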

You should fit with the verbose option. In your example, just uncomment the line:

fitter.Config().MinimizerOptions().SetPrintLevel(3);

This cannot happen: if the gradient is non-zero, the function must change.
And if the step size is not zero, it means that the gradient is non-zero.
Are you using Migrad or maybe Simplex?
It could also be that your gradient calculation is wrong and you provide a non-zero gradient where in reality it should be zero.
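To first order this is just the Taylor expansion of the FCN: for a step delta_p,

F(p + \delta p) \approx F(p) + g^T \delta p

so a non-zero gradient and a non-zero step give a non-zero change. Keep in mind, though, that in double precision the computed function value can still stay the same once |g^T \delta p| falls below the numerical resolution of F(p).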

The check of the derivatives is done by default when using Minuit, before the minimization.
This check is actually what causes the problem you observed with parameter values = 0. It is done, by mistake, too early, when the parameter values have not yet been set and are by default all zero. I will fix this bug now in the ROOT trunk. As a workaround, you can use Minuit2, which does not perform this gradient check and therefore does not have this problem.
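Selecting Minuit2 in your example is a one-line change in the fitter configuration:

fitter.Config().SetMinimizer("Minuit2", "Migrad");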

Thank you for the report

Lorenzo

[quote=“moneta”]ParameterGradient() is used when it is provided by the function. If you don’t provide an implementation of ParameterGradient(), then DoParameterDerivative is used. You should then implement DoParameterDerivative only for the non-fixed parameters and return 0 for the others.
[/quote]

OK, I see. I thought that maybe it was a bug, but I see it is intentional. I’ll do as you say.

I was blind not to see that before. I was messing with ROOT::Math::MinimizerOptions::SetDefaultPrintLevel(), but with little result. Shouldn’t it have the same effect as fitter.Config().MinimizerOptions().SetPrintLevel()?
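For reference, this is what I mean (a sketch) - I expected these two calls to be equivalent:

ROOT::Math::MinimizerOptions::SetDefaultPrintLevel(3); // global default - little effect for me
fitter.Config().MinimizerOptions().SetPrintLevel(3);   // per-fit setting - this one works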

I am using Migrad. For now it seems that the tolerance was too low - I was trying to set it with ROOT::Math::MinimizerOptions::SetDefaultTolerance() and, as I can now see with the verbose output, without success. I hope this finishes my problems here… :slight_smile:
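For reference, this is the call I was using (the value is illustrative); presumably the per-fit setter is the analogous knob:

ROOT::Math::MinimizerOptions::SetDefaultTolerance(0.01); // global default - no visible effect for me
fitter.Config().MinimizerOptions().SetTolerance(0.01);   // per-fit setting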

However, in the meantime I tried to use ROOT::Math::Derivator. Is this class already working? I could only get 0 return values from it. I guess this is a separate topic to discuss.
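Roughly what I tried (a sketch, assuming the GSL-based Derivator from the MathMore library and a trivial test function):

#include "Math/Derivator.h"   // from the MathMore library (GSL-based)
#include "Math/Functor.h"
#include <iostream>

double sq(double x) { return x * x; }

void testDerivator()
{
   ROOT::Math::Functor1D f(&sq);
   ROOT::Math::Derivator der(f);
   std::cout << der.Eval(3.) << std::endl; // I would expect ~6, but I only get 0
}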

Thank you for all your help!