How to put both data samples in one single data set?


I have just read the post
"Trying a simultaneous fit in RooFit"

and I have a question: how do I actually
"put both data samples in one single data set with the flags “A” and “B”"?

I now need to deal with the same problem.
I have two RooDataHists,

  RooDataHist datadp("datadp", "dataset with xd0 ", x ,hhdp);
  RooDataHist datad0("datad0", "dataset with xd0 ", x ,hhd0);

and I want to construct a single RooDataHist that comprises those two RooDataHists,
so that I can do the simultaneous fit.

Could anyone help me? Thanks.

Posted: Fri Mar 31, 2006 14:42 Post subject: Trying a simultaneous fit in RooFit


I want to do something that looks very simple, but I ran into problems when I used the RooSimultaneous class in RooFit. I have two data samples, S1 and S2, and in each of them I want to fit a Gaussian to a signal mass peak and a polynomial to the background. The problem is that my samples have different mass windows: S1 runs from m1_min to m1_max and S2 from m2_min to m2_max, with m1_min<m2_min and m1_max<m2_max. What I did was declare the variable M (for mass) and put both data samples in one single data set with the flags “A” and “B”, one for each sample. That is:

RooRealVar M("M","Invariant mass",m1_min,m2_max);
RooCategory tp("tp","tp");
RooDataSet data("data","data",RooArgSet(M,tp));

What do the bold words mean?

I hope someone can help me.

The RooFit manual 2.07 doesn’t include the 8th section (Discrete variable) or the 9th section (Multiple datasets and simultaneous fitting),
so I don’t understand RooCategory and how to relate it to RooDataSet
or RooDataHist.

Thanks in advance.


What you want can be done, but isn’t documented in the manual yet (I’m aiming
for a complete version by the end of the year).

Here is a short summary of how it works.

  1. Discrete variables: You can read more about them in the ‘old’ documentation
    here: … ide36.html. You might
    want to read this first before you read the rest.

  2. In a simultaneous fit a discrete variable serves as the index variable: it tells, for every
    event, which sample it belongs to (e.g. S1 or S2). For a simultaneous fit to two histograms
    this may seem trivial, but RooFit can also do unbinned simultaneous fits. So in the unbinned case each event is characterized by a pair of variables (x,c), where ‘c’ is the category that indicates which subsample the event belongs to. For binned data you should see the input data as an effective 2D histogram with e.g. 100 bins in x and 2 bins in c, the sample index category.

  3. Now we get down to the special case that you have, where you have different ranges for x, depending on c.

There is the ‘old & ugly’ method: You just give the observable x different names for the two samples, e.g. x1 and x2 (note that both the C++ variable names
and the intrinsic names given as the 1st ctor argument should be different).

There is also the ‘new’ method, which is more elegant. The idea here is that you can in general perform a fit over a different range of x than the range declared in the RooRealVar. In the following example

RooRealVar x("x","x",-10,10) ;
x.setRange("fitRange",-5,5) ;
 // definition of pdf and data happen somewhere
pdf.fitTo(data,Range("fitRange")) ;

you fit only the subset of the data with x in the range [-5,5]. This method of specifying the ranges can be extended to specify multiple ranges for simultaneous fits:

RooCategory c("c","Sample index") ;
c.defineType("Sample1") ;
c.defineType("Sample2") ;
RooRealVar x("x","x",-10,10) ;
x.setRange("fitRange_Sample1",-7,7) ;
x.setRange("fitRange_Sample2",-5,5) ;

pdf.fitTo(data,Range("fitRange"),SplitRange(kTRUE)) ;

The latter option tells the simultaneous fit to not look for a single range “fitRange” that
applies globally, but rather to look for an individual range “fitRange_XXX” for each
subsample where XXX is the label of the RooCategory that classifies the samples.



For completeness on this forum, I attach the example I just sent privately as well.


// Needed for the named arguments ProtoData(), Minos(), Verbose(), Range(), SplitRange()
using namespace RooFit ;

// Index category labelling the two subsamples
RooCategory c("c","c") ;
c.defineType("S1") ;
c.defineType("S2") ;

// One observable, with a separate named fit range per subsample
RooRealVar x("x","x",-10,10) ;
x.setRange("fitRange_S1",-10,0) ;
x.setRange("fitRange_S2", 0,10) ;

RooRealVar m1("m1","m1",-5,-10,10) ;
RooRealVar m2("m2","m2", 5,-10,10) ;
RooRealVar s("s","s",5,0.1,10) ;
RooGaussian g1("g1","g1",x,m1,s) ;
RooGaussian g2("g2","g2",x,m2,s) ;

RooSimultaneous simPdf("simPdf","simPdf",c) ;
simPdf.addPdf(g1,"S1") ;
simPdf.addPdf(g2,"S2") ;

// Generate 1000 random values of c
// (little trick: g1 does not depend on c, uniform random distribution is generated)
RooDataSet* proto = g1.generate(c,1000) ;

// Generate 1000 points in x according to simPdf
// (sampled from g1 or g2, depending on input value of c)
RooDataSet* data = simPdf.generate(x,ProtoData(*proto)) ;

// Bin the generated events in (x,c)
RooDataHist hdata("hdata","hdata",RooArgSet(x,c),*data) ;

simPdf.fitTo(hdata,Minos(0),Verbose(1),Range("fitRange"),SplitRange(1)) ;


Thank you, Wouter!

I want to do a similar thing:
“Put both binned data samples in one single data set”, which is not answered in the reply.

In your example, you use “simPdf.generate()” to obtain the DataSet.
But in my case, I already have two TH1Fs, and I want to assign one TH1F with category “A”, and another category “B”, and put both into a single dataset.

Initially, one can do, just as mentioned before:

RooRealVar M("M","Invariant mass",m1_min,m2_max);
RooCategory tp("tp","tp");
RooDataSet data("data","data",RooArgSet(M,tp));

But how do I add the two TH1Fs to the dataset?
Using something like this?

RooDataHist datadp("datadp", "dataset with xd0 ", M ,hhdp);
RooDataHist datad0("datad0", "dataset with xd0 ", M ,hhd0);

How can I make the bold code work?