Dear all,
I am trying to solve the following problem with RooFit/RooStats. I have a counting experiment made of independent runs. In each run there is a "beam-ON" time, with a measured number of events n_ON^i, and a "beam-OFF" time, with a measured number of events n_OFF^i. The overall likelihood reads:

L(S, B_i) = prod_i Pois(n_ON^i | f_i*S + B_i) * Pois(n_OFF^i | tau_i*B_i)
where f_i is the luminosity of run i over the total luminosity, and tau_i is the beam-OFF/beam-ON time ratio for run i. I want to set a limit on S; the B_i are the nuisance parameters, and n_ON^i and n_OFF^i are the observables.

I tried to implement this model in RooFit. The code I am using is attached (the first part reads the data from a txt file). In particular, the overall PDF is a RooProdPdf of many terms, one per run, each term being the product of two Poisson PDFs.
I noticed that, when using this code, if the number of runs I am considering is more than 15, the code I use to find the upper limit (HypoTestInverter) crashes. Is there perhaps a limit on the number of terms I can use in this product?

I attach to this message the 3 files necessary to reproduce the problem: the root macro to create the model, the root macro for Hypothesis inversion, and the file with data.

Dear @jalopezg,
thanks! Maybe the issue here is the following: I am considering a "single" experiment made of many runs, i.e. my dataset, viewed as a table where each row is one "event" and each column is one observable, is at the moment a single-row, multi-column dataset (the number of columns is 2*N, where N is the number of runs).
Maybe I should change approach and treat the independent runs as independent rows in the dataset? The full PDF would still be the product over all terms.

To be honest, I have almost no experience with RooFit and I cannot personally help you with this, but I am sure @moneta can point you in the right direction.

Sorry for my late reply. I have downloaded the code and it seems to run fine. How can I reproduce your problem?
I am using ROOT master; which ROOT version are you using?

Hi Lorenzo,
I am using ROOT version 6.18/04 (apologies for not having reported this before). To reproduce the issue, change the "14" in the data.txt file to "20" or more. This is the number of runs considered, i.e. the number of lines read from the data.txt file.

Actually, I cannot reproduce the crash with ROOT >= 6.22 and 6.24 on Linux, but I can on macOS.
I suspect there is an issue with the sorting of a large vector. Which operating system are you using?

Hi @moneta,
I confirm this fix is working properly on macOS.

I still have an issue with this code: if I increase the number of "runs" that I am considering (the first number in the data.txt file) too much, I start to get nonsensical results.
I am currently comparing the result of the likelihood I wrote before with that of a simpler likelihood, obtained by summing all the "beam-on" periods and, separately, all the "beam-off" periods; this is the standard Non/Noff problem. For the simpler likelihood I am using the AsymptoticCalculator, while for the full model I am using both the FrequentistCalculator and the AsymptoticCalculator.

If nRuns <= 4, the results I obtain for the full model are reasonable, with both calculators.

If nRuns == 5, instead, I start to see some problems. For example, the MINUIT output reports errors (see below). Does this suggest that there may be some issue with my model?

MIGRAD WILL VERIFY CONVERGENCE AND ERROR MATRIX.
MINUIT WARNING IN HESSE
============== Second derivative zero for parameter1
MNHESS FAILS AND WILL RETURN DIAGONAL MATRIX.
MIGRAD MINIMIZATION HAS CONVERGED.
MIGRAD WILL VERIFY CONVERGENCE AND ERROR MATRIX.
MINUIT WARNING IN HESSE
============== Second derivative zero for parameter1
MNHESS FAILS AND WILL RETURN DIAGONAL MATRIX.

For completeness, I attach again my code to read the data and produce the model. By the way, should the "nOFF" data be declared as "global observables"? My guess is that this is not appropriate if I want the FrequentistCalculator to also throw toys for them. I also attach the full error log for the case nRuns = 5.

Dear all,
I think I found the cause of this problem. It was related to the initialization of the B_i nuisance parameters. I write down the solution in case it is useful for other people.

The likelihood for this problem reads:

L(S, B_i) = prod_i Pois(n_ON^i | f_i*S + B_i) * Pois(n_OFF^i | tau_i*B_i)
where n_ON^i and n_OFF^i are the measured values of the ON and OFF counts, f_i is the luminosity fraction, S is the signal (POI) and B_i the background. tau_i is the ratio T_OFF/T_ON.

The B_i are the nuisance parameters, while tau_i is known without any uncertainty.

To find the values of B_i that maximize the likelihood, one can take the logarithm of the latter and compute the derivatives. The result is the set of equations:

n_ON^i/(f_i*S + B_i) - 1 + n_OFF^i/B_i - tau_i = 0
In the limit of runs all having the same luminosity fraction, i.e. f_i = 1/N_runs, the solution is given by:

B_i = n_OFF^i / tau_i

The equation for S is more complicated:

sum_i f_i * [ n_ON^i/(f_i*S + B_i) - 1 ] = 0
In my code, I was performing a wrong initialization of B_i: instead of using n_OFF^i/tau_i, I was writing n_OFF^i * tau_i. This created issues in the search for the minimum.