Different outputs with RooNLLVar+minimizer vs createNLL+minimizer vs fitTo for extended maximum likelihood fit

mghneimat · July 14, 2020, 3:19pm

Dear experts,

I would like to do an unbinned extended maximum likelihood fit (a simultaneous data and MC fit) with external constraints. For the nominal case, i.e. without constraints, we construct the likelihood using:
method1 (RooNLLVar), then call the Minuit2 minimizer. But since ExternalConstraints isn’t possible with RooNLLVar, I’ve switched to:
method2 (createNLL), then call the same Minuit2 minimizer,
assuming that method1 and method2 would give similar results without any external constraint. However, I get very different results between 1 and 2. Moreover, in method 2 I get many warnings (p.d.f normalization integral is zero or negative), and also a problem with the normalization when calling the plotOn method, although the fit converges in the end. I am also pretty sure that I have passed the same options to both methods. Additionally, using
method3 (fitTo) for the same problem, I get no convergence.

I did a similar comparison for a RooFit example (rf501_simultaneouspdf.C), see the attachment, where I changed the model to be extended and added the three different fit methods for comparison. I got only very tiny differences in the results, as shown in the attached “output”.

Are the differences expected, or should the results be identical to the last digit? Would these differences get magnified with more complex models? In my actual code I have e.g. Johnson S_u, Fermi-Dirac, Crystal Ball, etc. and many fit components.

For your info, in my real code, I call the three methods as follows:

method1:
RooNLLVar nll("nll", "", *m_simul, *m_comb_data, extend, range, cpu);
method2:
RooAbsReal* nll = m_simul->createNLL(*m_comb_data, extend, range, cpu);

common to method1 and 2:
RooMinimizer min(nll);  // for method2, where nll is a pointer, pass *nll
min.optimizeConst(true);
min.setMaxIterations(10000);
min.setEps(1);
min.setOffsetting(true);
min.setMinimizerType("Minuit2");
min.migrad();
m_result_migrad = min.save("migrad_result");
min.hesse();
m_result_hesse = min.save("hesse_result");

method3:
m_result_hesse = m_simul->fitTo(*m_comb_data, extend, range, cpu, opt, type, offset, RooFit::Save());

where
m_simul is a RooSimultaneous object, and
auto extend = RooFit::Extended(true);
auto range = RooFit::Range(m_xmin, m_xmax);
auto cpu = RooFit::NumCPU(NCORES);
auto opt = RooFit::Optimize(true);
auto offset = RooFit::Offset(true);
auto type = RooFit::Minimizer("Minuit2");

I am not sure what is missing in the above calls. Are there some “hidden” settings that cause the differences and that I should set manually, or is it normal to get this?

One last question: is there a way to use ExternalConstraints with RooNLLVar?

thanks in advance,
Mazuza
mod_rf501_simultaneouspdf.C (6.3 KB)
output.txt (9.3 KB)

moneta · July 15, 2020, 8:54am

Hi,
Thank you for reporting this interesting problem. We need to look at it in detail.
Concerning your question: you can add external constraints when calling RooAbsPdf::createNLL, or you can add them by hand by summing the NLL with a function that is the negative log of the constraint pdf.
This is basically what is done in this code:
https://root.cern.ch/doc/master/RooAbsPdf_8cxx_source.html#l01105

Lorenzo

mghneimat · July 15, 2020, 9:09am

Dear @moneta,

Thanks for the reply.
I am avoiding calling RooAbsPdf::createNLL until I can confirm that it returns the same results as RooNLLVar, if possible, since the latter is my original method and my aim is a fair comparison with and without ExternalConstraints.
Thanks for the code, I will look into that.

cheers

moneta · July 15, 2020, 9:19am

Actually, after running your example code, I did not observe a significant difference between the results of your 3 different fits. There are only small differences, which can be attributed to slightly different ways the minimization is performed or to the order in which the likelihood is computed, and which are easily explained by numerical error.

Lorenzo

mghneimat · July 15, 2020, 9:24am

Hi,

that’s why I have asked the following questions:
Are the differences expected, or should the results be identical to the last digit? Would these differences get magnified if I have more complex models?

In my actual code, as I mentioned, I get really big differences, apart from the warnings about the normalization when using createNLL, even though, as you can see in the part of the code I posted, the same options are passed to both methods. In fact, they are the same as in the simple example that you have just tried.

mghneimat · July 15, 2020, 9:26am

I don’t see a difference there, but I do see one in my results. Am I missing other options?

moneta · July 15, 2020, 9:44am

Hi

You would need to set the initial parameter values and errors to be the same in all cases. Different initial points might result in different minimisations.

Lorenzo

mghneimat · July 15, 2020, 10:18am

Okay, then I have a question about understanding the simultaneous fit. The inputs to the simultaneous MC and data fit are the same for both methods, i.e. the initial parameter values and errors are stored in ROOT files (they are the result of the MC-only fit) and provided to both methods. However, I am not sure whether these values get replaced at a later step; for example, is the result of the simultaneous MC fit used as initial values for the data fit?
I’ve checked the output of result.floatParsInit() and found it to be different for the two methods, and these values don’t agree with the values in the input ROOT files, so I assume they were obtained at an intermediate step in the fit. Or do you have a better explanation? Thanks.

moneta · July 15, 2020, 12:34pm

In the example you posted, the initial values will be the ones from the fit performed before. When comparing the different methods, it is better to save a snapshot of the initial parameter values and then re-initialize them before each fit.
In the code above, you do this before the first fit:

RooArgSet initialParams; 
auto params = simPdf.getParameters(combData);
params->getSnapshot(initialParams); 
// do first minimisation.....

// and before second minimisation
*params = initialParams;

mghneimat · July 15, 2020, 2:55pm

Hi @moneta,
thanks for pointing that out. This explains why I have different initial values in the posted output.txt. I have now redone the comparison with one method at a time, and this gives identical initial and final values (by the way, the tip you posted gives an error: no member named ‘getSnapshot’ in ‘RooArgSet’).

In my actual code, though, I already do the fit with one method at a time, so I assumed the same initial values were used in each. However, retrieving the initial values with result.floatParsInit() for each method indicates that they are different. That is why I asked for a bit more explanation of what happens in the simultaneous fit, so that I can understand where these initial values come from. In my case the fit is done in two steps: first I do the MC-only fit and save the workspaces and results, then I provide them to the simultaneous MC and data fit. I was thinking that the result of the “MC-only fit” would be used as initial values for the “simultaneous MC and data fit” and thus be equal to result.floatParsInit(), but it is not. I am missing something in between.

mghneimat · July 16, 2020, 1:39pm

Dear @moneta, if I may ask about the manual addition of the external constraints: I’ve tried it and got the following error: Error in ROOT::Math::Fitter::SetFCN: wrong fit parameter settings.
I did as follows:

m_Bkg2 = new RooRealVar("RelFracBkg2","relFracBkg2 DATA-MC frac",                        1.,  0., 10.);
m_Bkg1 = new RooRealVar("RelFracBkg1","relFracBkg1 DATA-MC frac",                        1.,  0., 10.);
m_Bkg2_constraint = new RooGaussian("RelFracBkg2_constraint","RelFracBkg2_constraint",*m_Bkg2, RooFit::RooConst(1.0),RooFit::RooConst(0.051)) ;
m_Bkg1_constraint = new RooGaussian("RelFracBkg1_constraint","RelFracBkg1_constraint", *m_Bkg1, RooFit::RooConst(1.0), RooFit::RooConst(0.185)) ;
m_v_obj.push_back(m_Bkg2);
m_v_obj.push_back(m_Bkg1);
m_v_obj.push_back(m_Bkg2_constraint);
m_v_obj.push_back(m_Bkg1_constraint);
RooAbsReal* nll ;
auto theNLL  = new RooNLLVar("nll", "", *m_simul, *m_comb_data, extend, range, cpu);
nll = theNLL;
RooConstraintSum*  nllCons = new RooConstraintSum("nllCons","nllCons",RooArgSet(*m_Bkg2_constraint,*m_Bkg1_constraint), RooArgSet(*m_Bkg2, *m_Bkg1));
RooAbsReal* orignll = nll ;
nll = new RooAddition("nllWithCons","nllWithCons",RooArgSet(*nll,*nllCons)) ;
nll->addOwnedComponents(RooArgSet(*orignll,*nllCons)) ;
RooMinimizer min(*nll);

Can you see where the problem is? Note that I have no problems without the constraints, i.e. when only doing

RooNLLVar nll("nll", "", *m_simul, *m_comb_data, extend, range, cpu);
RooMinimizer min(nll);

thanks

StephanH · July 16, 2020, 1:57pm

Mazuza, see this for posting code:

It will be easier to read, and probably also less work for you.

moneta · July 16, 2020, 1:59pm

Hi,
It is difficult to tell what went wrong and causes that error. Can you please post the full macro?
Thank you

Lorenzo

mghneimat · July 16, 2020, 3:34pm

Hi @moneta,

Thanks for your reply. It looks to be caused by a flag I set: I use the constraints only if the flag is set to true, and otherwise do the nominal fit. But before checking the flag I first initialize nll and min, so it could be that some inconsistency arose, since it worked when I commented out the flag and the initialization so that I have only the “with constraint” case.

I will look into that more and see the results, then I might come back here if I see problems.

thanks again
Mazuza
