Accessing pdf values for different x-values (RooKeysPdf)

Hi all

I am having difficulty accessing different values from my pdf, given x-values.
I am using RooKeysPdfs and I am accessing a RooDataSet (all of which are held in a RooWorkspace).
I then want to loop over the RooDataSet and get the y-value of the pdf for each data point.
The following extract of code focuses on my problem. (The RooWorkspace* is w).

    RooKeysPdf* pdf = (RooKeysPdf*)w->pdf("mykeyspdf");
    RooArgSet* set;
    double val;
    double prob;
    RooDataSet* d0 = (RooDataSet*)w->data("data");
    int entries = d0->numEntries();
    cout << entries << endl;
    for(int i = 0; i < entries; i++){
        set = (RooArgSet*)d0->get(i); 
        xi = (RooRealVar*)set->find(xi->GetName());
        val = xi->getVal();
        xi->setVal(val);
        prob = pdf->getVal(RooArgSet(*xi));
        cout << val << " --- " << prob << endl;
    }

(Nb: I’m pretty sure the xi->setVal(val) is redundant as xi is set to the value of that RooDataSet event.)

The loop itself works: it steps through the dataset and reads a different x-value on each iteration. However, “prob” comes out the same for every x-value, even though the distribution is not flat. It is also the same value that pdf->Print() reports, so I am clearly not evaluating the pdf at the point I think I am.

Could someone please provide some guidance, as I have worked my way through the documentation and posts here on RootTalk and have not been able to find a solution which works.

Many thanks,
Ian

Hi Ian,

The issue is the following: a dataset and the pdf each have their own copy of the RooRealVar x.
In your loop you only manipulate the x in the dataset, but never the x of the pdf.

Generically, you can do the following if you want to loop over the data:

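    // the pdf's own copies of the variables that also appear in the dataset
    RooArgSet* pdfObs = pdf->getObservables(*data);

    for (int i = 0; i < data->numEntries(); i++) {
        // copy the values of entry i into the pdf's observables
        *pdfObs = *data->get(i);

        // now access pdf at x[i]
        double val = pdf->getVal();
    }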
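
To make the "two copies of x" point concrete, here is a toy model in plain C++. It is only a sketch of the mechanism and does not use RooFit: ToyVar, ToyGaussPdf and evaluateAtData are made-up names, and a simple Gaussian stands in for the keys pdf. The pdf reads x through a pointer to its own variable, so values read from the dataset only matter once they are copied across.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Toy stand-in for a RooRealVar: a single mutable value.
struct ToyVar {
    double value;
};

// Toy stand-in for a pdf. It reads x through a pointer to its OWN variable,
// so changing any other ToyVar has no effect on getVal().
struct ToyGaussPdf {
    ToyVar* x;
    double mean;
    double sigma;

    double getVal() const {
        assert(sigma > 0.0);
        const double pi = 3.14159265358979323846;
        const double z = (x->value - mean) / sigma;
        return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * pi));
    }
};

// Loop over the data rows. With copyIntoPdfVar == false only the row's own
// variable is touched (the original mistake); with true, the row value is
// copied into the pdf's variable before evaluating (the fix).
std::vector<double> evaluateAtData(const ToyGaussPdf& pdf,
                                   const std::vector<double>& rows,
                                   bool copyIntoPdfVar) {
    std::vector<double> result;
    for (double row : rows) {
        ToyVar rowVar{row};               // the dataset's copy of x
        if (copyIntoPdfVar) {
            pdf.x->value = rowVar.value;  // toy analogue of *pdfObs = *data->get(i)
        }
        result.push_back(pdf.getVal());
    }
    return result;
}
```

Calling evaluateAtData(pdf, rows, false) returns the same number for every row, which is the symptom described in the first post. With true, each row gets its own pdf value.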

Wouter

Hi Wouter

Thank you immensely for your speedy reply and for explaining my misunderstanding. Your prescription appears to have solved my problem and the pdf values are being written as expected.

Ian

Hi

Sorry for bringing this up again.

The template you provided worked fine, but now I am generating a dataset from a pdf and then want to get back from the pdf the y-values for each of the datapoints generated.

I am applying the same routine as before, but the RooArgSet* returned by the getObservables method comes back empty.

I am effectively doing:

    // generate() returns a new dataset, so there is no need to create one with new first
    RooDataSet* data = mypdf->generate(myvariable, 50);
    RooArgSet* myset = mypdf->getObservables(data);

When I call Print() on the RooDataSet, I see the pdf which generated the data and the variable name, but when I call Print() on the RooArgSet, I get: RooArgSet::dependents = (). So even when I later set the RooArgSet to data->get(i), getVal() does not return anything other than the mean (I think) of the pdf.

Is there a special usage I need to apply when I am generating the data, rather than calling data which has been saved into a RooWorkspace previously?

Well, I have solved my issue, but maybe I should explain it in full, as it took me a couple of days of thinking to figure out.

I had a pdf which was generated with a RooRealVar in a range of -50 to 50.

I wanted to then sample a specific range of this pdf owing to the fact that I was interested only in the region of 0.1 to 0.9 and I knew how many events to expect in this range.

Naively I had created a new variable: RooRealVar* xi_s = new RooRealVar("xi_s","xi_s",0.1,0.9);
I was then generating data from the pdf with this variable.
This was working fine and I was generating events with a range between 0.1 and 0.9 as I desired.

I then wanted to evaluate the pdf at each of these xi values, and following the prescription provided previously did not work. The RooArgSet was not being allocated the xi_s variable as a dependent. Only when drafting out an email to try and explain my problem fully (which I would have sent to Wouter) did I realise the problem.

The RooAbsPdf I am using has a variable xi. It is happy to generate data with other RooRealVars passed to it as the “observable”. However, when I asked it to do pdf->getObservables(subsetDataset), it found no match, because the pdf has a RooRealVar called xi as its observable whereas the dataset was using a RooRealVar called xi_s. It therefore did not select any variable, and when I Print() the RooArgSet, I got () dependents, i.e. no dependents, no observables!

I realise now that in order to evaluate the pdf, I still need to use the variable xi, even though its range is not the one I wanted to generate from.

My solution was to use my new RooRealVar called xi_s to generate the toy MC data, but then, when I created a RooArgSet* set = pdf->getObservables(*xi), I explicitly said that I required the observable xi (which is what the pdf was created with in the first place). So far so good. I am sampling the pdf in a given range and I have a RooArgSet from which to evaluate the pdf.

The final step, which was the real stumbling block I guess, is that in order to evaluate the pdf, I need to use the variable xi, not xi_s.

I therefore (after generation and accessing the generated value) used xi->setVal(value) to set the value of xi to be the value of xi_s which was generated between 0.1 and 0.9. I then evaluated the pdf at this value through xi using the prescription provided.

And that is how I sampled a pdf within a different range to the full pdf range and evaluated the pdf value within this subrange.
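
For anyone reading later, the name mismatch can be sketched with a toy model in plain C++. This is not RooFit code: ToyVars, matchObservables and copyIntoPdfVar are made-up names, and the sketch assumes that the observables are selected by variable name, which is what the behaviour described above suggests.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a set of variables: values addressed by name.
using ToyVars = std::map<std::string, double>;

// Toy analogue of getObservables(): keep the dataset columns whose NAME
// matches one of the pdf's variables.
std::vector<std::string> matchObservables(const ToyVars& pdfVars,
                                          const std::vector<std::string>& dataColumns) {
    std::vector<std::string> matched;
    for (const std::string& column : dataColumns) {
        if (pdfVars.count(column) > 0) {
            matched.push_back(column);
        }
    }
    return matched;
}

// Workaround from the thread: copy a generated value into the pdf's own
// variable by hand (toy analogue of xi->setVal(value)).
void copyIntoPdfVar(ToyVars& pdfVars, const std::string& target, double value) {
    assert(pdfVars.count(target) > 0 && "target must be one of the pdf's variables");
    pdfVars[target] = value;
}
```

With a pdf that only knows "xi", matching against a dataset whose single column is "xi_s" gives an empty list, which corresponds to the empty () printout. Copying the generated value into "xi" by hand is then the only way to move the pdf to that point.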

So again I keep running into a brick wall.

I now wish to evaluate a different pdf, which was created with the same RooRealVar xi as its observable, at the xi_s values which have been generated within the smaller range, just as I have done with the other pdf.

I thought naively that I could evaluate this different pdf using the same RooArgSet, which has xi as its observable, and that because I have already used xi->setVal(xi_s_value) I should be able to do what I have done previously. However, this does not seem to work.

I thought that, as both pdfs are using the same observable, I could evaluate them using the same RooArgSet, but this doesn’t seem to work. Could anyone provide any guidance as to how to proceed?

Edit: I should add that this approach does seem to be working for the pdf which did not have data generated in a smaller range in the RooDataSet; it is only failing for the other pdf, which needs to evaluate xi in the xi_s range. I think my first step might be to write a new for loop and re-evaluate everything from scratch, rather than reusing objects that have already been used, and see if that works, but this seems to be weird behaviour.

So to continue my long standing conversation with myself, and also to document my problems and solutions.

It seems that RooArgSet has a method setRealValue(const char* name, double value), which sets the value of the named variable inside the set. This lets me take the value which was generated in the dataset, set it in the RooArgSet I am using to access the different pdf, and then evaluate that pdf at that particular value.
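
As a rough illustration of setting a value by name, here is a toy version in plain C++. It is a sketch, not RooFit code: ToyArgSet and toyExpPdf are made-up names, the free function setRealValue only mimics the idea, and its return convention (true when the name is found) is the toy's own choice. It also assumes that each pdf keeps its own set of observables, which is not confirmed anywhere in this thread.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>

// Toy stand-in for a RooArgSet: variable values addressed by name.
using ToyArgSet = std::map<std::string, double>;

// Toy analogue of setting a value by name.
// Returns true if the name was found (the toy's own convention).
bool setRealValue(ToyArgSet& set, const std::string& name, double value) {
    auto it = set.find(name);
    if (it == set.end()) {
        return false;
    }
    it->second = value;
    return true;
}

// Toy pdf: unnormalised exponential fall-off in xi, read from the given set.
double toyExpPdf(const ToyArgSet& obs, double slope) {
    assert(obs.count("xi") > 0);
    return std::exp(-slope * obs.at("xi"));
}
```

The usage pattern would be: for each generated value, call setRealValue on the observable set of every pdf you want to evaluate, and only then ask each pdf for its value.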