Simultaneous fit to 3 datasets

Hello,
I have a problem using RooFit to do a simultaneous fit of 3 mass spectra (3 years of data). I first created 3 datasets, one for each year. I then add a tag to each dataset indicating which year it is, and append all the data to dataset data1:

RooCategory tagCat("tagCat","tagging Category") ;
tagCat.defineType("dat04");
tagCat.defineType("dat03");
tagCat.defineType("dat06");
tagCat.setLabel("dat04") ;
data1->addColumn(tagCat) ;
tagCat.setLabel("dat03") ;
data2->addColumn(tagCat) ;
tagCat.setLabel("dat06") ;
data3->addColumn(tagCat) ;
data1->append(*data2) ;
data1->append(*data3) ;
RooSimultaneous sum("sum", "sum", tagCat) ;
sum.addPdf(sum1,"dat04") ;
sum.addPdf(sum2,"dat03") ;
sum.addPdf(sum3,"dat06") ;
RooFitResult* r1 = sum.fitTo(*data1, Range(1.4,3.5), Extended(true), Save(true)) ;
data1->plotOn(xframe1,Cut("tagCat==tagCat::dat04"),MarkerSize(0.5)) ;
data1->statOn(xframe1,What("N"),Cut("tagCat==tagCat::dat04")) ;
tagCat="dat04";
sum.plotOn(xframe1,Slice(tagCat),ProjWData(RooArgSet(tagCat,psimass),*data1));
sum.paramOn(xframe1,Parameters(RooArgSet(nsig1,sigmean,sigwidth1,nbkg1,slope1)),Layout(0.12,0.65,0.44));
xframe1->Draw();

My problem is that the first 2 datasets are complete, but the 3rd is truncated, with no error or warning about the truncation itself. The fit runs, but the number of entries in the 3rd set is smaller than it should be, and the following messages appear:

[#0] ERROR:Fitting -- RooAbsTestStatistic::initSimMode: creating slave GOF calculator #0 for state dat04 (303 dataset entries)
[#0] ERROR:Fitting -- RooAbsTestStatistic::initSimMode: creating slave GOF calculator #1 for state dat03 (628 dataset entries)
[#0] ERROR:Fitting -- RooAbsTestStatistic::initSimMode: creating slave GOF calculator #2 for state dat06 (385 dataset entries)

The last line shows that it reads 385 entries instead of about 600.
What am I doing wrong? Please help me, I have already lost more than 3 days trying to fix this.
Regards,

Catarina

======================================

Wouter will process your mail once he is back online.

Rene

Hi Catarina,

I assume that data3->numEntries() gives you the 600 events you expect?

To be sure that your composite dataset is OK, can you do

Roo1DTable* table = data1->table(tagCat) ;
table->Print() ;

after merging the datasets. This will give you the event count for each
year in your composite dataset.

Further down, the one obvious place I see in your macro where events get cut out
is the 'Range(1.4,3.5)' specification in your fit. If you do that, only events
that survive that cut will be used in the likelihood for the fit.

Can you see what happens if you do

Roo1DTable* t2 = data1->table(tagCat,"psimass>1.4&&psimass<3.5") ;
t2->Print() ;

If that reproduces the event counts reported by RooAbsTestStatistic, then the reported counts are simply a reflection of the range cut you imposed.
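
To make the check above concrete, here is a minimal sketch in plain C++ (no ROOT needed) of what a per-category count with a mass cut amounts to. It is only a toy model: ToyEvent and countPerTag are hypothetical names invented for illustration and are not part of RooFit.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy stand-in for one row of the merged dataset (hypothetical, not RooFit).
struct ToyEvent {
    std::string tag;  // year label, e.g. "dat04"
    double mass;      // value of the mass observable
};

// Count events per tag, keeping only those with lo < mass < hi.
// This mirrors the idea of a category table built with a cut expression.
std::map<std::string, int> countPerTag(const std::vector<ToyEvent>& events,
                                       double lo, double hi) {
    std::map<std::string, int> counts;
    for (const auto& e : events) {
        if (e.mass > lo && e.mass < hi) {
            ++counts[e.tag];
        }
    }
    return counts;
}
```

If the counts obtained with the cut match what the fit reports, and the counts without the cut match numEntries(), the range cut alone explains the difference.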

Wouter

NB: The reported message is really an INFO message, not an ERROR message. This was a typo on my side when I ported the code to the new RooMsgService interface; it has since been fixed.

Dear Wouter,
Thanks for your hints, I have made some progress on this.
data3->numEntries() gives me the correct number of entries. But after tagging and appending the datasets,
Roo1DTable* table = data1->table(tagCat) ;
table->Print() ;
shows that, for the 3rd dataset, the number of entries is in some cases no longer correct (it is smaller than it should be). And it is not because I am doing the fit in a restricted range. The numbers I get are the following:

data1->numEntries()=303
data2->numEntries()=628
data3->numEntries()=593

Table tagCat : data1 (all merged)
+-------+-----+
| dat04 | 303 |
| dat03 | 628 |
| dat06 | 385 |
+-------+-----+

Table tagCat : data1 (all merged, psimass>1.4 && psimass<3.5)
+-------+-----+
| dat04 | 299 |
| dat03 | 623 |
| dat06 | 383 |
+-------+-----+

Table tagCat : data3 (psimass>1.4 && psimass<3.5)
+-------+-----+
| dat06 | 591 |
+-------+-----+

But the strangest thing I noticed is that this problem does not always occur. I have 4 datasets (h1, h2, h3, h4) per year = 12 datasets. Merging the 3 years works fine for the
last 2 (i.e. h3 and h4) but not for the first 2 (h1 and h2). I do all the tagging and appending the same way.
Any idea where the problem might come from?
Thank you again,

Catarina

Hi Catarina,

One possibility is that you have defined different ranges on the psi mass in your various datasets. If, for example, the psi range that is set for data1 is smaller than that of data3,
all entries in data3 that do not fit in the psi range of data1 are discarded.

You can investigate this as follows:

data1->get()->Print("v") ;
data2->get()->Print("v") ;
data3->get()->Print("v") ;

This will show you the ranges that are set on all observables in these 3 datasets. They should be identical.
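
The discarding mechanism described above can be illustrated with a small sketch in plain C++. This is a toy model under stated assumptions, not RooFit code: ToyDataSet, add, append and sameRange are hypothetical names, and the only behaviour modelled is that a dataset silently rejects values outside its own range when another dataset is appended to it.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a dataset with a single ranged observable.
// NOT RooFit: all names here are hypothetical and for illustration only.
struct ToyDataSet {
    double lo;                   // lower edge of the observable's range
    double hi;                   // upper edge of the observable's range
    std::vector<double> values;  // stored entries

    // Store a value only if it lies inside this dataset's own range.
    bool add(double x) {
        if (x < lo || x > hi) return false;  // rejected without any message
        values.push_back(x);
        return true;
    }

    // Append another dataset. Returns the number of entries that were
    // dropped because they fall outside the range of *this* dataset.
    std::size_t append(const ToyDataSet& other) {
        std::size_t dropped = 0;
        for (double x : other.values) {
            if (!add(x)) ++dropped;
        }
        return dropped;
    }
};

// True when both datasets declare the same range for the observable.
inline bool sameRange(const ToyDataSet& a, const ToyDataSet& b) {
    return a.lo == b.lo && a.hi == b.hi;
}
```

In this toy picture, checking sameRange before every append plays the role of comparing the printed ranges of the real datasets: if the ranges differ, a non-zero number of dropped entries is to be expected.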

Wouter

Hello Wouter,
Indeed, it was a problem of ranges, not in psimass, but in another variable I am using.
Now it works just fine! :smiley:
Btw, the method you proposed,
data1->get()->Print("v") ;
just shows me the variable values in the last row, like this:
RooArgSet:::

  1. RooRealVar::psimass[ ] = 0.60808
  2. RooRealVar::psiz[ ] = 0.477812
  3. RooRealVar::psipt[ ] = 0.233442
  4. RooRealVar::zprim[ ] = 60.5297
So, to be sure it was a problem of ranges, I recreated all the datasets, now paying careful attention that the ranges are the same for the datasets to be merged, and this fixed the problem.
Thank you again,

catarina