Simultaneous fit using weighted data set

Dear Experts,

I want to perform a simultaneous fit using two data samples, where each data sample contains two weighted RooDataSets. Here is some sample code:

#two fitting variables
Mbc = ROOT.RooRealVar("Mbc", 'M_{bc} [GeV/c^{2}]', 5.2,5.29)
deltaE = ROOT.RooRealVar("deltaE", "#DeltaE [GeV]", -0.15,0.1)

roo_data1 = df2roo(A1, {'Mbc': Mbc, 'deltaE':deltaE}) #first data set
roo_data2 = df2roo(A2, {'Mbc': Mbc, 'deltaE':deltaE}) #second data set
weightA1 = ROOT.RooRealVar( "weightA1", "weightA1",1) #weight factor for roo_data1
roo_dataA1 = ROOT.RooDataSet("roo_dataA1", "roo_dataA1", roo_data1, ROOT.RooArgSet(Mbc, deltaE, weightA1), "", "weightA1" ) #weighted data set A1
weightA2 = ROOT.RooRealVar( "weightA2", "weightA2",0.12)  #weight factor for roo_data2
roo_dataA2 = ROOT.RooDataSet("roo_dataA2", "roo_dataA2", roo_data2, ROOT.RooArgSet(Mbc, deltaE, weightA2), "", "weightA2" ) #weighted data set A2
roo_dataA1.append(roo_dataA2) #appending A1 with A2
roo_dataA1.Print() ##this will be input as the first data set in a simultaneous fit

roo_data3 = df2roo(A3, {'Mbc': Mbc, 'deltaE':deltaE})
weightA3 = ROOT.RooRealVar( "weightA3", "weightA3",1) 
roo_dataA3 = ROOT.RooDataSet("roo_dataA3", "roo_dataA3", roo_data3, ROOT.RooArgSet(Mbc, deltaE, weightA3), "", "weightA3" )

roo_data4 = df2roo(A4, {'Mbc': Mbc, 'deltaE':deltaE})
weightA4 = ROOT.RooRealVar( "weightA4", "weightA4",0.12) 
roo_dataA4 = ROOT.RooDataSet("roo_dataA4", "roo_dataA4", roo_data4, ROOT.RooArgSet(Mbc, deltaE, weightA4), "", "weightA4" ) 
roo_dataA3.append(roo_dataA4)
roo_dataA3.Print() ##this will be input as second data set in simultaneous fit

Let’s assume roo_data1 and roo_data2 each contain 100 events. The weighted data set (roo_dataA1) should then have 112 events, and this is what I obtain. But when I perform a simultaneous fit to roo_dataA1 and roo_dataA3, neither data set is weighted anymore: the fit shows 200 events in each sample instead of 112. Here is how I am doing the simultaneous fit:

sample = ROOT.RooCategory("sample", "sample")
sample.defineType("mumu")
sample.defineType("ee")

combData = ROOT.RooDataSet("combData","combined data",ROOT.RooArgSet(Mbc,deltaE),ROOT.RooFit.Index(sample),ROOT.RooFit.Import("mumu",roo_dataA1),ROOT.RooFit.Import("ee",roo_dataA3))
 
simPdf = ROOT.RooSimultaneous("simPdf", "simultaneous pdf", sample)
simPdf.addPdf(model_mumu,"mumu")  
simPdf.addPdf(model_ee,"ee") 

I can’t think of why the weighting doesn’t work for the simultaneous fit. If I perform the fit separately for each case (mumu, ee), I get the expected result.
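To make the two numbers above concrete, here is a minimal pure-Python sketch of the arithmetic (no ROOT needed). It assumes 100 events per input data set, as in the description; the helper names `sum_of_weights` and `number_of_entries` are made up for this illustration and are not RooFit functions.

```python
def sum_of_weights(components):
    """Sum of per-event weights for a list of (n_events, weight) pairs."""
    return sum(n * w for n, w in components)


def number_of_entries(components):
    """Plain event count, i.e. what you get when every weight is treated as 1."""
    return sum(n for n, _ in components)


# "mumu" sample: roo_data1 with weight 1 and roo_data2 with weight 0.12,
# each assumed to hold 100 events.
mumu = [(100, 1.0), (100, 0.12)]

weighted_yield = sum_of_weights(mumu)       # what the weighted data set reports
unweighted_yield = number_of_entries(mumu)  # what the fit sees once the weights are lost

print(f"weighted: {weighted_yield:.1f}, unweighted: {unweighted_yield}")
```

A fit that reports the second number is counting entries rather than summing weights, which is the symptom described here.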

Waiting for some valuable feedback!

Thanks

Hi @CS_p; @jonas or @moneta should be able to help you with your issue.

Cheers,
J.

Hi!

Unfortunately, the weights of the input datasets are lost when you create a combined dataset, unless you specify a WeightVar(). More info here:

As this behavior is unexpected, I consider it a problem and want to change it for ROOT 6.28. There should be a weight variable by default if all the input datasets are weighted.

I hope this helps!
Jonas

PS: I see that you are using df2roo from PyrooFit. Have you tried RooDataSet.from_pandas(), which comes with ROOT 6.26 onward? I’m just curious whether you are using PyrooFit because from_pandas() didn’t work for you due to some bug, or because you just didn’t know about it yet.

Hi @jonas,

Thank you very much for your suggestion!

Here is how I solved the problem:

weight1 = ROOT.RooRealVar( "weight1", "weight1",1.0) 
weight2 = ROOT.RooRealVar( "weight2", "weight2",0.12) 
roo_dataA1 = ROOT.RooDataSet("roo_dataA1","roo_dataA1",ROOT.RooArgSet(ROOT.RooArgSet(Mbc,deltaE),weight1),ROOT.RooFit.Import(roo_data1),ROOT.RooFit.WeightVar("weight1"))
roo_dataA2 = ROOT.RooDataSet("roo_dataA2","roo_dataA2",ROOT.RooArgSet(ROOT.RooArgSet(Mbc,deltaE),weight2),ROOT.RooFit.Import(roo_data2),ROOT.RooFit.WeightVar("weight2"))
roo_dataA2.Print()
roo_dataA1.append(roo_dataA2)

I created the other data set in the same way:

roo_dataA3 = ROOT.RooDataSet("roo_dataA3","roo_dataA3",ROOT.RooArgSet(ROOT.RooArgSet(Mbc,deltaE),weight1),ROOT.RooFit.Import(roo_data3),ROOT.RooFit.WeightVar("weight1"))
roo_dataA4 = ROOT.RooDataSet("roo_dataA4","roo_dataA4",ROOT.RooArgSet(ROOT.RooArgSet(Mbc,deltaE),weight2),ROOT.RooFit.Import(roo_data4),ROOT.RooFit.WeightVar("weight2"))
roo_dataA3.append(roo_dataA4)

For the simultaneous fit data set:

combData = ROOT.RooDataSet("combData","combined data",ROOT.RooArgSet(ROOT.RooArgSet(Mbc,deltaE),weight2),ROOT.RooFit.Index(sample),ROOT.RooFit.Import("mumu",roo_dataA1),ROOT.RooFit.WeightVar("weight2"))
combData_ee = ROOT.RooDataSet("combData_ee","combined data",ROOT.RooArgSet(ROOT.RooArgSet(Mbc,deltaE),weight2),ROOT.RooFit.Index(sample),ROOT.RooFit.Import("ee",roo_dataA3),ROOT.RooFit.WeightVar("weight2"))
combData.Print()
combData_ee.Print()
combData.append(combData_ee)
combData.Print()

This solved my problem because I wanted to weight only one data set from each sample, for example only roo_dataA2 from the first set and roo_dataA4 from the second set. But if both sets are weighted, then combData won’t work, as it doesn’t accept “ROOT.RooFit.WeightVar("weight1, weight2")”.
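One possible way around the "two weight names" limitation, sketched here as an assumption rather than a confirmed RooFit recipe, is to give every event its weight in a single shared column (called `weight` below), so that only one name ever has to be passed to WeightVar(). The following pure-Python sketch shows the data layout only. The toy events, the weight 0.5, and the helpers `tag_events` and `sum_of_weights_by_channel` are hypothetical and not part of ROOT or PyrooFit.

```python
from collections import defaultdict


def tag_events(events, weight, channel):
    """Attach one common 'weight' column and a 'sample' label to (Mbc, deltaE) pairs."""
    return [
        {"Mbc": mbc, "deltaE": delta_e, "weight": weight, "sample": channel}
        for mbc, delta_e in events
    ]


def sum_of_weights_by_channel(rows):
    """Sum the shared 'weight' column separately for each 'sample' label."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["sample"]] += row["weight"]
    return dict(totals)


# Hypothetical toy events standing in for roo_data1 ... roo_data4 (100 each).
toy = [(5.28, 0.0)] * 100

# Both components of each channel carry a non-trivial weight (0.5 and 0.12),
# yet every row stores it under the same column name.
rows = (
    tag_events(toy, 0.5, "mumu") + tag_events(toy, 0.12, "mumu")
    + tag_events(toy, 0.5, "ee") + tag_events(toy, 0.12, "ee")
)

totals = sum_of_weights_by_channel(rows)
print(totals)
```

With such a layout, each row could be filled into one RooDataSet whose weight variable is named `weight`, for example through RooDataSet's `add(row, weight)`. Whether this fits the rest of the analysis is left open here.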