Adding constraints to RooSimultaneous

Is there a good way to add constraints to a RooSimultaneous pdf? Here’s what I’m doing right now. simPdf is a pdf generated by histfactory.

top_ratio_val = temp_file.Get("top_ratio")[0] ws.factory('expr::top_ratio("n_of_top/n_sf_top", n_of_top, n_sf_top)') ws.factory('RooGaussian::top_ratio_constraint(top_ratio, nom_top_ratio[{0}], {1})'.format(top_ratio_val, top_ratio_val*0.1)) vv_ratio_val = temp_file.Get("vv_ratio")[0] ws.factory('expr::vv_ratio("n_of_vv/n_sf_vv", n_of_vv, n_sf_vv)') ws.factory('RooGaussian::vv_ratio_constraint(vv_ratio, nom_vv_ratio[{0}], {1})'.format(vv_ratio_val, vv_ratio_val*0.1)) ws.factory('PROD:constrPdf(simPdf, top_ratio_constraint, vv_ratio_constraint)')

However, I’m running into a lot of problems with this. Fitting data works fine, but generating and fitting toys for some reason does not. In order to create binned toys, I have to generate them from simPdf, otherwise I get the wrong number of events.

    RooDataSet* toy_data = (RooDataSet*)simPdf->generate(*model->GetObservables(), AllBinned(), Extended(), Verbose(true));
    toy_data->Print();

RooDataSet::hmaster[obs_x_of,obs_x_sf,weightVar,channelCat] = 58 entries (10758 weighted) ← correct

    RooDataSet* toy_data = (RooDataSet*)constrPdf->generate(*model->GetObservables(), AllBinned(), Extended(), Verbose(true));
    toy_data->Print();

RooDataSet::wu[obs_x_of,obs_x_sf,channelCat,weight:weight] = 1682 entries (3.10662e+06 weighted) ← incorrect

This is weird, but I can live with it. However, when I try to fit the constrained pdf to toy data, it fails. A closer look reveals that the NLL is returning 0.

//doesn't work
    RooNLLVar *nll_constr_toy = (RooNLLVar*)constrPdf->createNLL(*toy_data, Constrain(constr), Range("fitRange"), SplitRange());
    nll_constr_toy->Print();

RooAddition::nll_constrPdf_hmaster_with_constr[ nll_constrPdf_hmaster + nll_constrPdf_hmaster_constr ] = [#0] ERROR:Eval -- RooAbsReal::logEvalError(constrPdf) evaluation error,
 origin       : RooProdPdf::constrPdf[ simPdf * top_ratio_constraint * vv_ratio_constraint ]
 message      : getLogVal() top-level p.d.f evaluates to zero
 server values: !pdfs=(simPdf = 0/1,top_ratio_constraint = 0.995084,vv_ratio_constraint = 0.943742)
[#0] ERROR:Eval -- RooAbsReal::logEvalError(constrPdf) evaluation error,
 origin       : RooProdPdf::constrPdf[ simPdf * top_ratio_constraint * vv_ratio_constraint ]
 message      : getLogVal() top-level p.d.f evaluates to zero
 server values: !pdfs=(simPdf = 0/1,top_ratio_constraint = 0.995084,vv_ratio_constraint = 0.943742)
[#0] ERROR:Eval -- RooAbsReal::logEvalError(constrPdf) evaluation error,
 origin       : RooProdPdf::constrPdf[ simPdf * top_ratio_constraint * vv_ratio_constraint ]
 message      : getLogVal() top-level p.d.f evaluates to zero
 server values: !pdfs=(simPdf = 0/1,top_ratio_constraint = 0.995084,vv_ratio_constraint = 0.943742)
[#0] ERROR:Eval -- RooAbsReal::logEvalError(constrPdf) evaluation error,
 origin       : RooProdPdf::constrPdf[ simPdf * top_ratio_constraint * vv_ratio_constraint ]
 message      : getLogVal() top-level p.d.f evaluates to zero
 server values: !pdfs=(simPdf = 0/1,top_ratio_constraint = 0.995084,vv_ratio_constraint = 0.943742)
[#0] ERROR:Eval -- RooAbsReal::logEvalError(constrPdf) evaluation error,
 origin       : RooProdPdf::constrPdf[ simPdf * top_ratio_constraint * vv_ratio_constraint ]
 message      : getLogVal() top-level p.d.f evaluates to zero
 server values: !pdfs=(simPdf = 0/1,top_ratio_constraint = 0.995084,vv_ratio_constraint = 0.943742)
[#0] ERROR:Eval -- RooAbsReal::logEvalError(constrPdf) evaluation error,
 origin       : RooProdPdf::constrPdf[ simPdf * top_ratio_constraint * vv_ratio_constraint ]
 message      : getLogVal() top-level p.d.f evaluates to zero
 server values: !pdfs=(simPdf = 0/1,top_ratio_constraint = 0.995084,vv_ratio_constraint = 0.943742)
[#0] ERROR:Eval -- RooAbsReal::logEvalError(constrPdf) evaluation error,
 origin       : RooProdPdf::constrPdf[ simPdf * top_ratio_constraint * vv_ratio_constraint ]
 message      : getLogVal() top-level p.d.f evaluates to zero
 server values: !pdfs=(simPdf = 0/1,top_ratio_constraint = 0.995084,vv_ratio_constraint = 0.943742)
inf

// works
    RooNLLVar *nll_constr_real = (RooNLLVar*)pdf->createNLL(*data, Constrain(constr), Range("fitRange"), SplitRange());
    nll_constr_real->Print();

RooAddition::nll_constrPdf_obsData_with_constr[ nll_constrPdf_obsData + nll_constrPdf_obsData_constr ] = -34851.8

// works
    RooNLLVar *nll_toy = (RooNLLVar*)simPdf->createNLL(*toy_data, Constrain(constr),Range("fitRange"), SplitRange());
    nll_toy->Print()

RooAddition::nll_simPdf_hmaster_with_constr[ nll_simPdf_hmaster + nll_simPdf_hmaster_constr ] = -27708.1

Can someone help me figure out what’s going on here? Is this usage not supported? Have I uncovered a bug? Is there a better way to accomplish what I’m trying to do?

Thanks,
Nic

Hello,

I’d like to bump this thread up and see if we can get a response.

I too am encountering the situation where if I am unable to ‘factorize’ out common constraint terms across categories, because if I do this, I get the wrong expected events coming from the model (the integrators that get set up end up integrating over the category too).

It seems like a call to createNLL on a RooProdPdf that contains a RooSimultaneous ends up with the integrator for the terms in the RooSimultaneous not having the category removed. Here’s a simple example:

RooWorkspace* w  = new RooWorkspace;

  //construct a 2-channel model (yield = 3.5*np and 5.5*np), with a nuisance parameter constrained by auxilliary measurement (of 1 +/- 0.1)

  w->factory("dummy_obs[0,1]");
  w->factory("dummy_cat[a,b]");

  w->defineSet("obs","dummy_obs,dummy_cat");

  w->factory("dummy_np[1,0,5]");

w->factory("ASUM::model_a(dummyConst[1]*expr::mean_a('3.5*dummy_np',dummy_obs,dummy_np))");
w->factory("ASUM::model_b(dummyConst*expr::mean_b('5*dummy_np',dummy_obs,dummy_np))");

w->factory("Gaussian::dummy_constraint(dummy_np_nom[1],dummy_np,0.1)");

  w->factory("PROD::fullmodel_a(model_a,dummy_constraint)");
  w->factory("PROD::fullmodel_b(model_b,dummy_constraint)");

  w->factory("SIMUL::model1(dummy_cat, a = fullmodel_a, b = fullmodel_b)"); //model with constraint inside each pdf

  w->factory("PROD::model2( dummy_constraint, SIMUL::simPdf(dummy_cat, a = model_a, b = model_b))"); //model where dummy_constraint is factorised out

  RooDataSet data("data","data",*w->set("obs"));
  data.add(*w->set("obs"),4);w->cat("dummy_cat")->setIndex(1);data.add(*w->set("obs"),7);
  
  w->pdf("model2")->createNLL(data);//this line will cause integrators over both dummy_obs and dummy_cat to be set up for mean_a and mean_b, giving expectedEvents = 17, when clearly it is supposed to be 8.5 events!