Reduce()ing Specific observables in RooDataSet

Hello experts,

It might sound stupid, but Is it possible in RooDataSet to project 2nd observable in a specific range of 1st observable?

For e.g. assume a dataset inherited from a TNtuple, which has the following structure-

RooDataSet ds(RooArgSet(M,X)) // M = 1000 entries with M_range [2,4] and X = 1000

I want to have X only for (M > 2.8 & M < 3.2) but M_range [2,4] no change in M-projection. For e.g.

RooDataSet ds_reduced(RooArgSet(M,X)) // M = 1000 (range [2,4]) ; X = 200 only for (M>2.8 & M < 3.2),

It is like having just a projection of X observable based M cut without affecting any entries/range of M.

Later I want to fit this ds_reduced by a 2-dimensional unbinned fit by a model(M,X), which I already defined.

Thanks in advance! :slight_smile:

@jonas Can you help?

@StephanH do you have any suggestions here?

Hi @hym! Sorry for the late answer.

If I understand correctly, you have a dataset with variables M and X. For one interval of M, you want to fit both M and X, and for another interval you want to fit only M.

What I would suggest is that you associate a RooCategory to the two different intervals of M and then do a simultaneous fit in both categories with RooSimultaneous. In one category, you only fit M and in the other you fit the conditional p(X|M)*p(M). Maybe X doesn’t depend on M in your case, but then the more general conditional will work just as well.

I have written the code to do this with the example of Gaussian distributed M and X:

RooDataSet makeFakeDataMX() {

   // create the RooDataSet with M and X,
   // nothing special here

   // m: mean 5, sigma 3
   // x: mean 4, sigma 2
   RooRealVar m("m", "m", 0, 10);
   RooRealVar x("x", "x", 0, 10);
   RooArgSet coord(m, x);

   RooDataSet dataSet{"data1", "data1", RooArgSet(m, x)};

   for (int i = 0; i < 10000; i++) {
      double tmpm = gRandom->Gaus(0., 3.);
      double tmpx = gRandom->Gaus(-1, 2);
      if (fabs(tmpm) < 5 && fabs(tmpx) < 5) {
         m = tmpm + 5.;
         x = tmpx + 5.;
         dataSet.add(coord);
      }
   }

   return dataSet;
}


RooDataSet makeFakeDataMXWithCat() {

    using namespace RooFit;

    RooRealVar m("m", "m", 0, 10);
    RooRealVar x("x", "x", 0, 10);

    RooCategory fitCat("fitCat", "Fitting category");
    fitCat.defineType("fitCatM");
    fitCat.defineType("fitCatMX");

    RooDataSet data = makeFakeDataMX();

    // get reduced datasets for both fit intervals
    std::unique_ptr<RooDataSet> dataM{static_cast<RooDataSet*>(
        data.reduce(RooArgSet{m, x}, "!(m > 2 && m <= 8)")
    )};
    std::unique_ptr<RooDataSet> dataMX{static_cast<RooDataSet*>(
        data.reduce(RooArgSet{m, x}, "m > 2 && m <= 8")
    )};

    // associate categories with datasets in intervals
    std::map<std::string,RooDataSet*> dataMap;
    dataMap["fitCatM"] = dataM.get();
    dataMap["fitCatMX"] = dataMX.get();

    // create new data set that has as many entries as the original,
    // and also an additional column with the fit category
    return RooDataSet{
        "data",
        "data",
        RooArgSet(m, x),
        Index(fitCat),
        Import(dataMap)
    };
}


void example() {

    using namespace RooFit;

    RooRealVar m("m", "m", 0, 10);
    RooRealVar x("x", "x", 0, 10);

    RooCategory fitCat("fitCat", "Fitting category");
    fitCat.defineType("fitCatM");
    fitCat.defineType("fitCatMX");

    RooDataSet data = makeFakeDataMXWithCat();

    // model for m
    RooRealVar mean_m("mean_m", "mean_m", 0., 10.);
    RooRealVar width_m("width_m", "width_m", 1., 10.);
    RooGaussian model_m("model_m", "model_m", m, mean_m, width_m);

    // model for x
    RooRealVar mean_x("mean_x", "mean_x", 0., 10.);
    RooRealVar width_x("width_x", "width_x", 1., 10.);
    RooGaussian model_x("model_x", "model_x", x, mean_x, width_x);

    // model for m and x
    RooProdPdf model_mx("model_mx", "model_mx", model_m,
                        Conditional(model_x, x));

    RooSimultaneous sim("pdf", "pdf", fitCat);
    sim.addPdf(model_m, "fitCatM");
    sim.addPdf(model_mx, "fitCatMX");

    auto * result = sim.fitTo(data,
                              Verbose(false),
                              PrintLevel(-1),
                              Save());

    // print fit results
    result->Print();
}

I hope this idea helps you so solve your problem!

If you have further questions, please let me know.

Jonas