Extended Fits with RooFit

Dear all,

I’m trying to make an extended fit in python using RooFit. I tried several things, but all of them crash. The unextended version (with n-1 coefficients compared to the number of pdfs in the RooAddPdf) works well, I now want to generalize it to the extended version.

The extended PDF I’m using is defined as :

mypdf=RooAddPdf(“mypdf”,“mypdf”,RooArgList(pdf1,pdf2),RooArgList(n1,n2))

with :
n1=RooRealVar(“n1”,“n1”,170000,0,500000)
n2=RooRealVar(“n2”,“n2”,320000,0,500000)

and :
Observables=RooArgSet(t,m)

Then, I tried :


1st version :
nexp=mypdf.expectedEvents(Observables)
data=RooDataSet()
data=mypdf.generate(Observables, NumEvent(nexp))


2nd version :
data=RooDataSet()
data=mypdf.generate(Observables, RooFit.Extended())


3rd version :
data=RooDataSet()
data=mypdf.generate(Observables, RooFit.Extended(kTRUE))


4th version :
data=RooDataSet()
data=mypdf.generate(Observables)


None of them succeeded and I don’t know what else to try. For the following fit, I have written :

mypdf.fitTo(data,Extended())

But the code even doesn’t go up to here, but crashes at the generation level.

Does anyone know what I’m doing wrong and how I should correct this to be finally able to make an extended fit ? It would be of great help !

Best Greetings,
Géraldine

Géraldine,

not a roofit expert here, but in order to help, can you post a runnable script (e.g. what is pdf1, pdf2, t, m)? Or, lacking that, how does it crash? Do you have a stack trace?

Cheers,
Wim

Dear Wim,

Below, you can find a simplified code that is running fine, if the “unextended” lines are used (a plot is also produced). However, when I want to make it “extended”, either by putting :

data = model.generate(RooArgSet(x),RooFit.Extended())

OR

nexp=model.expectedEvents(RooArgSet(x))
data = model.generate(RooArgSet(x), NumEvent(nexp))

the fit gives me :

File “TEST_EXT.py”, line 44, in
data = model.generate(RooArgSet(x),RooFit.Extended())
TypeError: none of the 4 overloaded methods succeeded. Full details:
RooDataSet* RooAbsPdf::generate(const RooArgSet& whatVars, Int_t nEvents, const RooCmdArg& arg1, const RooCmdArg& arg2 = RooCmdArg::none(), const RooCmdArg& arg3 = RooCmdArg::none(), const RooCmdArg& arg4 = RooCmdArg::none(), const RooCmdArg& arg5 = RooCmdArg::none()) =>
takes at least 3 arguments (2 given)
problem in C++; program state has been reset
RooDataSet* RooAbsPdf::generate(const RooArgSet& whatVars, Int_t nEvents = 0, Bool_t verbose = kFALSE) =>
could not convert argument 2 (Objects/longobject.c:211: bad argument to internal function)
RooDataSet* RooAbsPdf::generate(const RooArgSet& whatVars, const RooDataSet& prototype, Int_t nEvents = 0, Bool_t verbose = kFALSE, Bool_t randProtoOrder = kFALSE, Bool_t resampleProto = kFALSE) =>
could not convert argument 2

OR

Traceback (most recent call last):
File “TEST_EXT.py”, line 45, in
nexp=model.expectedEvents(RooArgSet(x))
TypeError: none of the 2 overloaded methods succeeded. Full details:
problem in C++; program state has been reset
problem in C++; program state has been reset


Here is the code :

from ROOT import *
import math
from math import pi
from array import array
############################################

OBSERVABLE

x = RooRealVar(“x”,“x”,0,10)
#############################################
#SIGNAL
mean = RooRealVar(“mean”,“mean”,5)
sigma1 = RooRealVar(“sigma1”,“sigma1”,0.5)
sigma2 = RooRealVar(“sigma2”,“sigma2”,1)
sig1 = RooGaussian(“sig1”,“sig1”,x,mean,sigma1)
sig2 = RooGaussian(“sig2”,“sig2”,x,mean,sigma2)
sigfrac = RooRealVar(“sigfrac”,“sigfrac”,0.8,0,1)
sig = RooAddPdf(“sig”,“sig”,RooArgList(sig1,sig2),RooArgList(sigfrac))
#########################################################
#BACKGROUND
a0 = RooRealVar(“a0”,“a0”,0.5,0,1)
a1 = RooRealVar(“a1”,“a1”,-0.2,0,1)
bkg1 = RooChebychev(“bkg1”,“bkg1”,x,RooArgList(a0,a1))
alpha = RooRealVar(“alpha”,“alpha”,-1)
bkg2 = RooExponential(“bkg2”,“bkg2”,x,alpha)
bkgfrac = RooRealVar(“bkgfrac”,“bkgfrac”,0.2,0,1)
bkg = RooAddPdf(“bkg”,“bkg”,RooArgList(bkg1,bkg2),RooArgList(bkgfrac))
########################################################
#TOTAL PDF
nsig = RooRealVar(“nsig”,“nsig”,17000,0,20000)
nbkg = RooRealVar(“nbkg”,“nbkg”,13000,0,20000)
n_sig = RooRealVar(“nsig”,“nsig”,0.3)
########################################################
#UNEXTENDED VERSION
model = RooAddPdf(“model”,“model”,RooArgList(bkg,sig),RooArgList(n_sig))
data = RooDataSet()
data = model.generate(RooArgSet(x),10000)
print "Finished generating, now fitting"
model.fitTo(data,
RooFit.Minos(False),
RooFit.Hesse(False))
########################################################
##EXTENDED VERSION
#model = RooAddPdf(“model”,“model”,RooArgList(bkg,sig),RooArgList(nsig,nbkg))
#data = RooDataSet()
#data = model.generate(RooArgSet(x),RooFit.Extended(kTRUE))
##nexp=model.expectedEvents(RooArgSet(x))
##data = model.generate(RooArgSet(x), NumEvent(nexp))
#print “Finished generating, now fitting”
#model.fitTo(data,RooFit.Extended(),

RooFit.Minos(False),

RooFit.Hesse(False))

########################################################
#PLOTS
testxframe = x.frame()
data.plotOn(testxframe, RooLinkedList())
model.plotOn(testxframe)
testcanvas = TCanvas(“testcanvas”, “plot”, 1200, 900)
testcanvas.cd(1)
testxframe.Draw()
testcanvas.SaveAs(“test_extended.eps”)

Thank you very much for your precious help,
Best Greetings

Géraldine

Geraldine,

okay, this is way beyond my knowledge of RooFit. However, when trying to run your code (I repladed NumEvent() by int() as the one modification), the issues only crop up when I omit the line:

So, to me, it looks like RooFit needs that extra RooRealVar (I don’t know why, b/c I don’t understand the code; nor do I understand why it doesn’t check rather than segfault, although some of the functions such as fitTo() do check). I’ll ping Wouter.

The stack traces that you post are the python-side only. The “problem in C++; program state has been reset” usually indicates a segfault on the C++ side, and a C++ stack trace is likely to be found upstream in the output as well (they are for me).

One remark, though. Lines like these:data = RooDataSet() data = ...
don’t actually do anything useful: the freshly created RooDataSet is immediately collected and destroyed after the only reference count pointing to it goes away. Likewise, by reusing the reference label “model”, older RooAddPdf’s may get deleted prematurely.

Cheers,
Wim

Dear Wim,

So, you mean that with putting the following lines, you had no problem ?

model = RooAddPdf(“model”,“model”,RooArgList(bkg,sig),RooArgList(nsig,nbkg))
data = RooDataSet()
nexp=model.expectedEvents(RooArgSet(x))
data = model.generate(RooArgSet(x),int(nexp))

Because, for me, it is not happy with the nexp=model.expectedEvents… line.
It would be indeed very nice of you if you could ask Wouter about it, as I think, I already checked the RooFit manual and the online tutorials about this extended fit and haven’t found the way to do it properly.

About the remark you made, does that mean that I should only put the data=model.generate… line and remove the one where I explicit that data should be a RooDataSet ?

Thanks so much Wim !!!

Géraldine

Hi,

Looking at this now. I haven’t seen anything that would be causing crashes when called as C++ code.

I need some help to understand how Python calls map to C++ class

model = RooAddPdf(“model”,“model”,RooArgList(bkg,sig),RooArgList(nsig,nbkg))

Does this map to

RooAddPdf* model = new RooAddPdf(…)

?
This line is OK in any case. The next one is indeed ineffective
as Wim also pointed out

data = RooDataSet()

The following line:

nexp=model.expectedEvents(RooArgSet(x))

should just work. Wim: what exactly does the ‘RooArgSet(x)’ do
when mapped to C++. Does it pass a reference or a pointer?
(Specifically for this method two versions exists, one taking a pointer
and one taking a reference. Which one is called)

data = model.generate(RooArgSet(x),int(nexp))

This should work fine, it will call the generate() method without RooCmdArg’s. I have the impression (well it is really anecdotal evidence) that those in general seem to be causing some trouble in the pyROOT interface.

Wouter

Hi Wim, Wouter,

Thank you for having a look at it.

Yes, this is right that : model = RooAddPdf(“model”,“model”,RooArgList(bkg,sig),RooArgList(nsig,nbkg))

maps in python to : RooAddPdf* model = new RooAddPdf(…)

The line : nexp=model.expectedEvents(RooArgSet(x))
already gives me the following segmentation fault :

Traceback (most recent call last):
File “TEST_EXT.py”, line 45, in
nexp=model.expectedEvents(RooArgSet(x))
TypeError: none of the 2 overloaded methods succeeded. Full details:
problem in C++; program state has been reset
problem in C++; program state has been reset

I haven’t found any additional information in my error and output files about it…

Géraldine

Hi,

So this seems to really comes down to how pyRoot maps the inline RooArgSet() construction in expectedEvents() to C++. The two overloaded methods are

virtual Double_t expectedEvents(const RooArgSet* nset) const ;
virtual Double_t expectedEvents(const RooArgSet& nset) const ;

Wouter

Hi,

This I don’t know about. Hopefully, Wim will be able to say more about this ?

Thanks a lot,

Géraldine

Hi,

[quote]model = RooAddPdf(“model”,“model”,RooArgList(bkg,sig),RooArgList(nsig,nbkg))
data = RooDataSet()
nexp=model.expectedEvents(RooArgSet(x))
data = model.generate(RooArgSet(x),int(nexp))[/quote]
yes, these worked fine for me.

[quote]Wim: what exactly does the ‘RooArgSet(x)’ do
when mapped to C++. Does it pass a reference or a pointer?
(Specifically for this method two versions exists, one taking a pointer
and one taking a reference. Which one is called)[/quote]
It fits either slot: there is no distinction between const ptr and const ref (non-const ptr does behave differently). The one called will be the one first tried.

[quote]Yes, this is right that : model = RooAddPdf(“model”,“model”,RooArgList(bkg,sig),RooArgList(nsig,nbkg))

maps in python to : RooAddPdf* model = new RooAddPdf(…)[/quote]
True, but if a second ‘model = …’ is done later, then that maps to ‘delete model; model = new …’. Hence my question whether old models are ok to be destroyed, or if they should be kept alive (i.e. should a reference be kept to it, or perhaps they should be released from refcounting completely)?

[quote] virtual Double_t expectedEvents(const RooArgSet* nset) const ;
virtual Double_t expectedEvents(const RooArgSet& nset) const ; [/quote]
This can be seen by doing:

Since the methods are considered equivalent, and given the order in the result, the pointer version will be selected. Because the pointer is const, the RooArgSet argument will remain under refcounting and will be deleted after the call.

One could explicitly select which version to call, using the full signature and disp(), but that will lead to somewhat hard to maintain code.

Is there an internal difference between the pointer and reference version of expectedEvents?

Cheers,
Wim

Dear Wim,

I do still have the crash at the “nexp=” line with the following lines:

model = RooAddPdf(“model”,“model”,RooArgList(bkg,sig),RooArgList(nsig,nbkg))
data = RooDataSet()
nexp=model.expectedEvents(RooArgSet(x))
data = model.generate(RooArgSet(x),int(nexp))

Could this be related to the RooFit version I’m using, as you don’t see the problem, or something I should import ? My version is v2.91.

Thanks,
Géraldine

Géraldine,

if there is a problem with deletion (i.e. if RooFit keeps hold of the pointer), then writing the code like this:argx = RooArgSet(x) nexp=model.expectedEvents(argx)
would work instead.

Version-wise, I’m running v2.95 (which corresponds to ROOT v22).

Cheers,
Wim

Hi Wim,

RooFit doesn’t keep the pointer after the call to expectedEvents()…

Wouter

Hi Wim and Wouter,

The last lines Wim gave me don’t work either. However, I have changed my version of RooFit. I’ve used an older one (v2.31) and there is no problem any more. At the moment, I don’t manage to have access to the v2.95 version, instead of the v2.91 one I’m using, but I guess this will solve the problem once I manage to do it. Thanks a lot for your help !

Cheers,
Géraldine